Feature engineering for electricity load forecasting
The purpose of this notebook is to demonstrate how to use skrub and polars
to perform feature engineering for electricity load forecasting.
We will build a set of features from different sources:
- Historical weather data for 10 medium to large urban areas in France;
- Holidays and calendar features for France;
- Historical electricity load data for the whole of France.
All these data sources cover a time range from March 23, 2021 to May 31, 2025.
Since our maximum forecasting horizon is 24 hours, we consider that the future weather data is known at a chosen prediction time. Similarly, the holidays and calendar features are known at prediction time for any point in the future.
Therefore, features derived from the weather and calendar data can be used to engineer “future covariates”. Since the load data is our prediction target, we can also use it to engineer “past covariates” such as lagged features and rolling aggregations.
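To make the notion of past covariates concrete, here is a minimal sketch in plain Python rather than in the polars code used in the rest of this notebook. The `load` values are made up, and the helpers `lag` and `rolling_mean` are hypothetical names introduced only for illustration. The point is that the feature at position `i` is computed only from values strictly before `i`, so no information from the future leaks into it.

```python
from statistics import mean


def lag(values, n):
    """Shift values by n steps: position i holds values[i - n], or None."""
    return [None] * n + values[: len(values) - n]


def rolling_mean(values, window):
    """Mean of the `window` values strictly before position i, or None."""
    out = []
    for i in range(len(values)):
        past = values[max(0, i - window):i]
        # Only emit a value once a full window of past observations exists.
        out.append(mean(past) if len(past) == window else None)
    return out


# Made-up hourly load values, for illustration only.
load = [50.0, 52.0, 55.0, 53.0, 51.0]
load_lag_1h = lag(load, 1)
load_mean_2h = rolling_mean(load, 2)
print(load_lag_1h)
print(load_mean_2h)
```

The leading `None` entries correspond to time steps for which not enough history is available yet.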
Environment setup
When running this notebook in JupyterLite, a few extra dependencies must be installed first. The development version of skrub is required to use skrub expressions.
%pip install -q https://pypi.anaconda.org/ogrisel/simple/polars/1.24.0/polars-1.24.0-cp39-abi3-emscripten_3_1_58_wasm32.whl
%pip install -q altair holidays https://pypi.anaconda.org/ogrisel/simple/skrub/0.6.dev0/skrub-0.6.dev0-py3-none-any.whl
ERROR: polars-1.24.0-cp39-abi3-emscripten_3_1_58_wasm32.whl is not a supported wheel on this platform.
Note: you may need to restart the kernel to use updated packages.
# The following 3 imports are only needed to work around some limitations
# when using polars in a pyodide/jupyterlite notebook.
import tzdata # noqa: F401
import pandas as pd
from pyarrow.parquet import read_table
import polars as pl
import skrub
from pathlib import Path
import holidays
import warnings
# Ignore warnings from pkg_resources triggered by Python 3.13's multiprocessing.
warnings.filterwarnings("ignore", category=UserWarning, module="pkg_resources")
Time range
Let’s define an hourly time range from March 23, 2021 to May 31, 2025 that will be used to join the electricity load data and the weather data. The time range is in the UTC time zone to avoid any ambiguity when joining with the weather data, which is also in UTC.
We wrap the polars dataframe in a skrub variable to benefit from the built-in TableReport display in the notebook. Using the skrub expression system will also be useful later.
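The ambiguity mentioned above comes from daylight saving time: in local French time, one wall-clock hour is repeated when clocks go back in autumn. The sketch below illustrates this with the standard library only. As a simplifying assumption, French local time is modelled with two fixed offsets (UTC+2 in summer, UTC+1 in winter) instead of a real time zone database, and the switch on October 31, 2021 is used as the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for French local time (an assumption made to
# avoid depending on a time zone database).
CEST = timezone(timedelta(hours=2))  # summer time
CET = timezone(timedelta(hours=1))  # winter time

# Two distinct UTC instants, one hour apart, around the 2021-10-31 switch
# back to winter time (which happens at 01:00 UTC).
before_switch = datetime(2021, 10, 31, 0, 30, tzinfo=timezone.utc)
after_switch = datetime(2021, 10, 31, 1, 30, tzinfo=timezone.utc)

# Wall-clock readings in local time, with the offset dropped.
local_before = before_switch.astimezone(CEST).replace(tzinfo=None)
local_after = after_switch.astimezone(CET).replace(tzinfo=None)

print(local_before, local_after)
```

Both instants read 02:30 on the local wall clock, so a join key expressed in naive local time could not tell them apart, whereas the UTC timestamps stay distinct.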
time_range_start = pl.datetime(2021, 3, 23, hour=0, time_zone="UTC")
time_range_end = pl.datetime(2025, 5, 31, hour=23, time_zone="UTC")
time = skrub.var(
    "time",
    pl.DataFrame().with_columns(
        pl.datetime_range(
            start=time_range_start,
            end=time_range_end,
            time_zone="UTC",
            interval="1h",
        ).alias("time"),
    ),
)
time
| time |
|---|
| 2021-03-23 00:00:00+00:00 |
| 2021-03-23 01:00:00+00:00 |
| 2021-03-23 02:00:00+00:00 |
| 2021-03-23 03:00:00+00:00 |
| 2021-03-23 04:00:00+00:00 |
| 2025-05-31 19:00:00+00:00 |
| 2025-05-31 20:00:00+00:00 |
| 2025-05-31 21:00:00+00:00 |
| 2025-05-31 22:00:00+00:00 |
| 2025-05-31 23:00:00+00:00 |
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | time | Datetime | 0 (0.0%) | 36744 (100.0%) | | | 2021-03-23T00:00:00+00:00 | | 2025-05-31T23:00:00+00:00 |
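As a quick sanity check, the number of rows reported above can be recomputed from the two endpoints of the time range. This is a small standard-library sketch, independent of polars and skrub; the variable names are only illustrative.

```python
from datetime import datetime, timedelta, timezone

start = datetime(2021, 3, 23, 0, tzinfo=timezone.utc)
end = datetime(2025, 5, 31, 23, tzinfo=timezone.utc)

# Both endpoints are included in the range, hence the "+ 1".
n_hours = (end - start) // timedelta(hours=1) + 1
print(n_hours)
```

The result, 36,744, matches the number of unique values reported for the `time` column.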
To avoid network issues when running this notebook, the necessary data
files have already been downloaded and saved in the datasets folder.
See the README.md file for instructions to download the data manually
if you want to re-run this notebook with more recent data.
data_source_folder = Path("../datasets")
for data_file in sorted(data_source_folder.iterdir()):
    print(data_file)
../datasets/README.md
../datasets/Total Load - Day Ahead _ Actual_202101010000-202201010000.csv
../datasets/Total Load - Day Ahead _ Actual_202201010000-202301010000.csv
../datasets/Total Load - Day Ahead _ Actual_202301010000-202401010000.csv
../datasets/Total Load - Day Ahead _ Actual_202401010000-202501010000.csv
../datasets/Total Load - Day Ahead _ Actual_202501010000-202601010000.csv
../datasets/weather_bayonne.parquet
../datasets/weather_brest.parquet
../datasets/weather_lille.parquet
../datasets/weather_limoges.parquet
../datasets/weather_lyon.parquet
../datasets/weather_marseille.parquet
../datasets/weather_nantes.parquet
../datasets/weather_paris.parquet
../datasets/weather_strasbourg.parquet
../datasets/weather_toulouse.parquet
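The names of the load files listed above appear to encode the period they cover as `<start>-<end>`. The sketch below shows one way to extract that period, assuming each bound is formatted as `YYYYMMDDHHMM`. The helper `parse_period` is a hypothetical name, not part of the notebook, and the file names do not say which time zone the bounds refer to, so the result is kept as naive datetimes.

```python
from datetime import datetime
from pathlib import Path


def parse_period(path):
    """Return the (start, end) datetimes encoded at the end of a file name."""
    # Assumption: the stem ends with "_<start>-<end>", each bound being
    # formatted as YYYYMMDDHHMM.
    period = Path(path).stem.rsplit("_", 1)[-1]
    start_text, end_text = period.split("-")
    fmt = "%Y%m%d%H%M"
    return datetime.strptime(start_text, fmt), datetime.strptime(end_text, fmt)


name = "Total Load - Day Ahead _ Actual_202101010000-202201010000.csv"
period_start, period_end = parse_period(name)
print(period_start, period_end)
```

Read this way, each load file covers one calendar year, which means the first file starts before the March 23, 2021 start of the time range used in this notebook.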
We select 10 medium to large urban areas that together cover most regions of France, with a slight emphasis on the most populated regions, which are likely to drive electricity demand.
city_names = [
    "paris",
    "lyon",
    "marseille",
    "toulouse",
    "lille",
    "limoges",
    "nantes",
    "strasbourg",
    "brest",
    "bayonne",
]
all_city_weather_raw = {}
for city_name in city_names:
    # all_city_weather_raw[city_name] = skrub.var(
    #     f"{city_name}_weather_raw",
    all_city_weather_raw[city_name] = (
        pl.from_arrow(read_table(f"../datasets/weather_{city_name}.parquet"))
    ).with_columns(
        [
            pl.col("time").dt.cast_time_unit(
                "us"
            ),  # Ensure time column has the same type
        ]
    )
all_city_weather_raw["brest"]
| time | temperature_2m | precipitation | wind_speed_10m | cloud_cover | soil_moisture_1_to_3cm | relative_humidity_2m |
|---|---|---|---|---|---|---|
| datetime[μs, UTC] | f32 | f32 | f32 | f32 | f32 | f32 |
| 2021-01-01 00:00:00 UTC | null | null | null | null | null | null |
| 2021-01-01 01:00:00 UTC | null | null | null | null | null | null |
| 2021-01-01 02:00:00 UTC | null | null | null | null | null | null |
| 2021-01-01 03:00:00 UTC | null | null | null | null | null | null |
| 2021-01-01 04:00:00 UTC | null | null | null | null | null | null |
| … | … | … | … | … | … | … |
| 2025-05-31 19:00:00 UTC | 17.5175 | 0.0 | 12.0694 | 5.0 | 0.168 | 73.0 |
| 2025-05-31 20:00:00 UTC | 16.2675 | 0.0 | 9.114471 | 99.0 | 0.168 | 77.0 |
| 2025-05-31 21:00:00 UTC | 15.5175 | 0.0 | 7.559999 | 93.0 | 0.169 | 84.0 |
| 2025-05-31 22:00:00 UTC | 15.5675 | 0.0 | 9.0 | 100.0 | 0.17 | 82.0 |
| 2025-05-31 23:00:00 UTC | 15.5675 | 0.0 | 5.506941 | 100.0 | 0.171 | 81.0 |
all_city_weather_raw["brest"].drop_nulls(subset=["temperature_2m"])
| time | temperature_2m | precipitation | wind_speed_10m | cloud_cover | soil_moisture_1_to_3cm | relative_humidity_2m |
|---|---|---|---|---|---|---|
| datetime[μs, UTC] | f32 | f32 | f32 | f32 | f32 | f32 |
| 2021-03-23 00:00:00 UTC | 4.628 | null | 10.086427 | null | null | 94.0 |
| 2021-03-23 01:00:00 UTC | 5.028 | 0.0 | 11.183201 | 6.0 | null | 95.0 |
| 2021-03-23 02:00:00 UTC | 5.078 | 0.0 | 10.966713 | 6.0 | null | 94.0 |
| 2021-03-23 03:00:00 UTC | 4.628 | 0.0 | 10.464797 | 5.0 | null | 93.0 |
| 2021-03-23 04:00:00 UTC | 4.428 | 0.0 | 10.464797 | 5.0 | null | 92.0 |
| … | … | … | … | … | … | … |
| 2025-05-31 19:00:00 UTC | 17.5175 | 0.0 | 12.0694 | 5.0 | 0.168 | 73.0 |
| 2025-05-31 20:00:00 UTC | 16.2675 | 0.0 | 9.114471 | 99.0 | 0.168 | 77.0 |
| 2025-05-31 21:00:00 UTC | 15.5175 | 0.0 | 7.559999 | 93.0 | 0.169 | 84.0 |
| 2025-05-31 22:00:00 UTC | 15.5675 | 0.0 | 9.0 | 100.0 | 0.17 | 82.0 |
| 2025-05-31 23:00:00 UTC | 15.5675 | 0.0 | 5.506941 | 100.0 | 0.171 | 81.0 |
all_city_weather = time.skb.eval()
for city_name, city_weather_raw in all_city_weather_raw.items():
    all_city_weather = all_city_weather.join(
        city_weather_raw.rename(lambda x: x if x == "time" else x + "_" + city_name),
        on="time",
        how="inner",
    )
all_city_weather = skrub.var(
    "all_city_weather",
    all_city_weather,
)
all_city_weather
| time | temperature_2m_paris | precipitation_paris | wind_speed_10m_paris | cloud_cover_paris | soil_moisture_1_to_3cm_paris | relative_humidity_2m_paris | temperature_2m_lyon | precipitation_lyon | wind_speed_10m_lyon | cloud_cover_lyon | soil_moisture_1_to_3cm_lyon | relative_humidity_2m_lyon | temperature_2m_marseille | precipitation_marseille | wind_speed_10m_marseille | cloud_cover_marseille | soil_moisture_1_to_3cm_marseille | relative_humidity_2m_marseille | temperature_2m_toulouse | precipitation_toulouse | wind_speed_10m_toulouse | cloud_cover_toulouse | soil_moisture_1_to_3cm_toulouse | relative_humidity_2m_toulouse | temperature_2m_lille | precipitation_lille | wind_speed_10m_lille | cloud_cover_lille | soil_moisture_1_to_3cm_lille | relative_humidity_2m_lille | temperature_2m_limoges | precipitation_limoges | wind_speed_10m_limoges | cloud_cover_limoges | soil_moisture_1_to_3cm_limoges | relative_humidity_2m_limoges | temperature_2m_nantes | precipitation_nantes | wind_speed_10m_nantes | cloud_cover_nantes | soil_moisture_1_to_3cm_nantes | relative_humidity_2m_nantes | temperature_2m_strasbourg | precipitation_strasbourg | wind_speed_10m_strasbourg | cloud_cover_strasbourg | soil_moisture_1_to_3cm_strasbourg | relative_humidity_2m_strasbourg | temperature_2m_brest | precipitation_brest | wind_speed_10m_brest | cloud_cover_brest | soil_moisture_1_to_3cm_brest | relative_humidity_2m_brest | temperature_2m_bayonne | precipitation_bayonne | wind_speed_10m_bayonne | cloud_cover_bayonne | soil_moisture_1_to_3cm_bayonne | relative_humidity_2m_bayonne |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2021-03-23 00:00:00+00:00 | 6.414499759674072 | | 3.5999999046325684 | | | 61.0 | 2.7850000858306885 | | 10.182336807250977 | | | 77.0 | 11.078999519348145 | | 10.464797973632812 | | | 49.0 | 5.832499980926514 | | 7.787990570068359 | | | 73.0 | 4.650000095367432 | | 6.6380720138549805 | | | 86.0 | -0.3489999771118164 | | 5.001279830932617 | | | 98.0 | 3.0280001163482666 | | 6.849466800689697 | | | 83.0 | 4.436999797821045 | | 4.553679466247559 | | | 81.0 | 4.628000259399414 | | 10.086426734924316 | | | 94.0 | 4.3480000495910645 | | 6.489992141723633 | | | 91.0 |
| 2021-03-23 01:00:00+00:00 | 6.014500141143799 | 0.0 | 3.545588731765747 | 6.0 | | 62.0 | 2.384999990463257 | 0.0 | 8.942214012145996 | 6.0 | | 78.0 | 10.729000091552734 | 0.0 | 11.18320083618164 | 0.0 | | 50.0 | 5.28249979019165 | 0.0 | 6.696386814117432 | 5.0 | | 74.0 | 4.300000190734863 | 0.0 | 7.1003098487854 | 22.0 | | 88.0 | -0.8989999890327454 | 0.0 | 5.154415130615234 | 10.0 | | 98.0 | 2.628000020980835 | 0.0 | 9.422101020812988 | 7.0 | | 81.0 | 3.63700008392334 | 0.0 | 4.213691711425781 | 66.0 | | 84.0 | 5.0279998779296875 | 0.0 | 11.18320083618164 | 6.0 | | 95.0 | 3.8980000019073486 | 0.0 | 5.483356475830078 | 19.0 | | 92.0 |
| 2021-03-23 02:00:00+00:00 | 5.7144999504089355 | 0.0 | 3.415259599685669 | 5.0 | | 64.0 | 1.9850000143051147 | 0.0 | 6.915374279022217 | 12.0 | | 79.0 | 10.328999519348145 | 0.0 | 11.570514678955078 | 0.0 | | 51.0 | 4.732499599456787 | 0.0 | 6.287129878997803 | 0.0 | | 75.0 | 4.050000190734863 | 0.0 | 7.4215898513793945 | 72.0 | | 89.0 | -1.499000072479248 | 0.0 | 5.154415130615234 | 12.0 | | 98.0 | 2.2780001163482666 | 0.0 | 10.446206092834473 | 100.0 | | 81.0 | 3.0369999408721924 | 0.0 | 4.553679466247559 | 88.0 | | 87.0 | 5.078000068664551 | 0.0 | 10.966712951660156 | 6.0 | | 94.0 | 3.6480000019073486 | 0.0 | 5.399999618530273 | 96.0 | | 92.0 |
| 2021-03-23 03:00:00+00:00 | 5.364500045776367 | 0.0 | 3.239999771118164 | 11.0 | | 65.0 | 1.6349999904632568 | 0.0 | 5.399999618530273 | 84.0 | | 79.0 | 9.928999900817871 | 0.0 | 11.609650611877441 | 0.0 | | 51.0 | 4.332499980926514 | 0.0 | 5.506940841674805 | 0.0 | | 75.0 | 3.799999952316284 | 0.0 | 7.9932966232299805 | 73.0 | | 90.0 | -1.9489998817443848 | 0.0 | 5.315336227416992 | 17.0 | | 98.0 | 1.777999997138977 | 0.0 | 10.464797973632812 | 100.0 | | 85.0 | 2.736999988555908 | 0.0 | 4.6938252449035645 | 100.0 | | 88.0 | 4.628000259399414 | 0.0 | 10.464797019958496 | 5.0 | | 93.0 | 3.3980000019073486 | 0.0 | 5.091168403625488 | 67.0 | | 92.0 |
| 2021-03-23 04:00:00+00:00 | 5.064499855041504 | 0.0 | 3.3190360069274902 | 11.0 | | 66.0 | 1.5350000858306885 | 0.0 | 4.693825721740723 | 100.0 | | 79.0 | 9.678999900817871 | 0.0 | 11.440978050231934 | 0.0 | | 52.0 | 3.9825000762939453 | 0.0 | 4.896529197692871 | 0.0 | | 76.0 | 3.450000047683716 | 0.0 | 7.289444923400879 | 68.0 | | 90.0 | -2.0989999771118164 | 0.0 | 5.991593837738037 | 24.0 | | 98.0 | 1.4279999732971191 | 0.0 | 10.182336807250977 | 87.0 | | 90.0 | 2.437000036239624 | 0.0 | 4.452953815460205 | 100.0 | | 88.0 | 4.427999973297119 | 0.0 | 10.464797019958496 | 5.0 | | 92.0 | 3.0980000495910645 | 0.0 | 5.692099094390869 | 12.0 | | 92.0 |
| 2025-05-31 19:00:00+00:00 | 24.26500129699707 | 0.10000000149011612 | 9.007196426391602 | 100.0 | 0.2680000066757202 | 60.0 | 21.861000061035156 | 0.0 | 3.545588731765747 | 100.0 | 0.27399998903274536 | 64.0 | 19.91699981689453 | 0.0 | 10.464797019958496 | 0.0 | 0.13899999856948853 | 87.0 | 29.32000160217285 | 0.0 | 6.989935874938965 | 7.0 | 0.20800000429153442 | 45.0 | 24.030498504638672 | 0.0 | 14.399999618530273 | 100.0 | 0.2669999897480011 | 59.0 | 25.62350082397461 | 0.0 | 10.594036102294922 | 19.0 | 0.15700000524520874 | 53.0 | 21.73699951171875 | 0.0 | 12.245292663574219 | 100.0 | 0.17299999296665192 | 54.0 | 19.89349937438965 | 0.0 | 10.685391426086426 | 84.0 | 0.3149999976158142 | 79.0 | 17.517499923706055 | 0.0 | 12.0693998336792 | 5.0 | 0.1679999977350235 | 73.0 | 18.826499938964844 | 0.0 | 12.096214294433594 | 100.0 | 0.23800000548362732 | 83.0 |
| 2025-05-31 20:00:00+00:00 | 23.364999771118164 | 0.0 | 9.605998039245605 | 100.0 | 0.26899999380111694 | 62.0 | 21.56100082397461 | 0.0 | 2.545584201812744 | 80.0 | 0.27300000190734863 | 72.0 | 19.66699981689453 | 0.0 | 6.618519306182861 | 100.0 | 0.13899999856948853 | 87.0 | 27.3700008392334 | 0.0 | 5.399999618530273 | 100.0 | 0.20800000429153442 | 54.0 | 21.430500030517578 | 0.0 | 12.959999084472656 | 100.0 | 0.2669999897480011 | 61.0 | 23.323501586914062 | 0.0 | 8.654986381530762 | 23.0 | 0.15600000321865082 | 66.0 | 20.437000274658203 | 0.0 | 10.308830261230469 | 100.0 | 0.17299999296665192 | 58.0 | 20.14349937438965 | 0.0 | 6.287129878997803 | 94.0 | 0.3059999942779541 | 75.0 | 16.267499923706055 | 0.0 | 9.114471435546875 | 99.0 | 0.1679999977350235 | 77.0 | 18.476499557495117 | 0.0 | 11.631956100463867 | 100.0 | 0.23899999260902405 | 84.0 |
| 2025-05-31 21:00:00+00:00 | 22.46500015258789 | 0.0 | 13.854154586791992 | 100.0 | 0.2709999978542328 | 65.0 | 21.111000061035156 | 0.0 | 2.545584201812744 | 71.0 | 0.27300000190734863 | 76.0 | 17.91699981689453 | 0.0 | 7.386581897735596 | 100.0 | 0.13899999856948853 | 96.0 | 26.020000457763672 | 0.0 | 4.6102495193481445 | 6.0 | 0.20999999344348907 | 59.0 | 21.08049964904785 | 0.0 | 15.480000495910645 | 100.0 | 0.2669999897480011 | 61.0 | 21.62350082397461 | 0.0 | 7.636752605438232 | 100.0 | 0.1550000011920929 | 73.0 | 19.336999893188477 | 0.0 | 6.489992141723633 | 100.0 | 0.17399999499320984 | 63.0 | 19.443500518798828 | 0.0 | 4.349896430969238 | 94.0 | 0.30300000309944153 | 78.0 | 15.517499923706055 | 0.0 | 7.559999465942383 | 93.0 | 0.16899999976158142 | 84.0 | 18.226499557495117 | 0.0 | 10.144082069396973 | 100.0 | 0.23899999260902405 | 84.0 |
| 2025-05-31 22:00:00+00:00 | 20.96500015258789 | 0.0 | 9.6932954788208 | 95.0 | 0.2709999978542328 | 63.0 | 20.661001205444336 | 0.0 | 3.617955207824707 | 12.0 | 0.27300000190734863 | 80.0 | 17.567001342773438 | 0.0 | 4.6938252449035645 | 100.0 | 0.13899999856948853 | 96.0 | 24.57000160217285 | 0.0 | 3.396233081817627 | 100.0 | 0.210999995470047 | 71.0 | 18.58049964904785 | 0.0 | 10.440000534057617 | 38.0 | 0.2669999897480011 | 68.0 | 20.273500442504883 | 0.0 | 5.399999618530273 | 100.0 | 0.1550000011920929 | 81.0 | 18.48699951171875 | 0.0 | 6.725354194641113 | 100.0 | 0.17399999499320984 | 75.0 | 19.39349937438965 | 0.0 | 3.096837043762207 | 100.0 | 0.30000001192092896 | 80.0 | 15.567500114440918 | 0.0 | 9.0 | 100.0 | 0.17000000178813934 | 82.0 | 17.576499938964844 | 0.0 | 4.452953815460205 | 100.0 | 0.23999999463558197 | 90.0 |
| 2025-05-31 23:00:00+00:00 | 19.96500015258789 | 0.0 | 10.308831214904785 | 90.0 | 0.2720000147819519 | 64.0 | 19.961000442504883 | 0.0 | 3.2599384784698486 | 63.0 | 0.27300000190734863 | 83.0 | 17.567001342773438 | 0.0 | 2.595996856689453 | 100.0 | 0.13899999856948853 | 96.0 | 23.470001220703125 | 0.0 | 4.0249223709106445 | 100.0 | 0.21299999952316284 | 80.0 | 16.730499267578125 | 0.0 | 10.799999237060547 | 18.0 | 0.2669999897480011 | 78.0 | 19.223501205444336 | 0.0 | 4.735060214996338 | 16.0 | 0.1550000011920929 | 87.0 | 17.73699951171875 | 0.0 | 8.55710220336914 | 72.0 | 0.17499999701976776 | 81.0 | 19.293498992919922 | 0.0 | 3.617955207824707 | 100.0 | 0.296999990940094 | 81.0 | 15.567500114440918 | 0.0 | 5.506940841674805 | 100.0 | 0.17100000381469727 | 81.0 | 18.226499557495117 | 0.0 | 8.089993476867676 | 100.0 | 0.23999999463558197 | 83.0 |
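The join above renames every weather column except `time` by appending the city name, so that the per-city tables can be joined on `time` without column name collisions. The sketch below reproduces only this naming logic in plain Python, without polars. The helper `suffix_with_city` is a hypothetical name that mirrors the lambda used in the notebook, and the list of weather columns is copied from the table shown earlier for Brest.

```python
city_names = [
    "paris", "lyon", "marseille", "toulouse", "lille",
    "limoges", "nantes", "strasbourg", "brest", "bayonne",
]
weather_columns = [
    "time",
    "temperature_2m",
    "precipitation",
    "wind_speed_10m",
    "cloud_cover",
    "soil_moisture_1_to_3cm",
    "relative_humidity_2m",
]


def suffix_with_city(column, city_name):
    """Keep the join key unchanged and suffix every other column."""
    return column if column == "time" else column + "_" + city_name


# The join key appears once; every other column appears once per city.
joined_columns = ["time"]
for city_name in city_names:
    joined_columns += [
        suffix_with_city(column, city_name)
        for column in weather_columns
        if column != "time"
    ]

print(len(joined_columns))
print(joined_columns[:3])
```

With 6 weather variables for each of the 10 cities plus the shared `time` key, this gives 61 distinct column names, which matches the width of the joined table above.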
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | time | Datetime | 0 (0.0%) | 36744 (100.0%) | | | 2021-03-23T00:00:00+00:00 | | 2025-05-31T23:00:00+00:00 |
| 1 | temperature_2m_paris | Float32 | 0 (0.0%) | 1439 (3.9%) | 13.6 | 6.99 | -5.13 | 13.2 | 40.6 |
| 2 | precipitation_paris | Float32 | 1 (< 0.1%) | 135 (0.4%) | 0.0911 | 0.541 | 0.00 | 0.00 | 29.5 |
| 3 | wind_speed_10m_paris | Float32 | 0 (0.0%) | 1777 (4.8%) | 10.0 | 5.28 | 0.00 | 9.29 | 50.1 |
| 4 | cloud_cover_paris | Float32 | 1 (< 0.1%) | 103 (0.3%) | 69.2 | 39.9 | -1.00 | 97.0 | 101. |
| 5 | soil_moisture_1_to_3cm_paris | Float32 | 14414 (39.2%) | 277 (0.8%) | 0.298 | 0.0397 | 0.139 | 0.304 | 0.436 |
| 6 | relative_humidity_2m_paris | Float32 | 0 (0.0%) | 91 (0.2%) | 69.7 | 18.1 | 10.0 | 73.0 | 100. |
| 7 | temperature_2m_lyon | Float32 | 0 (0.0%) | 1565 (4.3%) | 14.1 | 7.96 | -5.89 | 13.8 | 40.3 |
| 8 | precipitation_lyon | Float32 | 1 (< 0.1%) | 150 (0.4%) | 0.0989 | 0.608 | 0.00 | 0.00 | 26.3 |
| 9 | wind_speed_10m_lyon | Float32 | 0 (0.0%) | 1729 (4.7%) | 8.07 | 6.04 | 0.00 | 6.48 | 43.2 |
| 10 | cloud_cover_lyon | Float32 | 1 (< 0.1%) | 103 (0.3%) | 64.4 | 41.8 | -1.00 | 92.0 | 101. |
| 11 | soil_moisture_1_to_3cm_lyon | Float32 | 14414 (39.2%) | 290 (0.8%) | 0.296 | 0.0380 | 0.124 | 0.304 | 0.441 |
| 12 | relative_humidity_2m_lyon | Float32 | 0 (0.0%) | 89 (0.2%) | 68.6 | 18.7 | 12.0 | 71.0 | 100. |
| 13 | temperature_2m_marseille | Float32 | 0 (0.0%) | 1276 (3.5%) | 17.5 | 6.14 | 0.317 | 17.1 | 36.6 |
| 14 | precipitation_marseille | Float32 | 1 (< 0.1%) | 103 (0.3%) | 0.0511 | 0.381 | 0.00 | 0.00 | 21.0 |
| 15 | wind_speed_10m_marseille | Float32 | 0 (0.0%) | 4018 (10.9%) | 14.8 | 10.8 | 0.00 | 11.8 | 74.6 |
| 16 | cloud_cover_marseille | Float32 | 1 (< 0.1%) | 103 (0.3%) | 46.5 | 44.5 | -1.00 | 31.0 | 101. |
| 17 | soil_moisture_1_to_3cm_marseille | Float32 | 14480 (39.4%) | 354 (1.0%) | 0.226 | 0.0748 | 0.100 | 0.223 | 0.459 |
| 18 | relative_humidity_2m_marseille | Float32 | 0 (0.0%) | 86 (0.2%) | 63.4 | 13.2 | 14.0 | 64.0 | 99.0 |
| 19 | temperature_2m_toulouse | Float32 | 0 (0.0%) | 1513 (4.1%) | 15.2 | 7.48 | -5.33 | 14.6 | 41.2 |
| 20 | precipitation_toulouse | Float32 | 1 (< 0.1%) | 121 (0.3%) | 0.0737 | 0.586 | 0.00 | 0.00 | 36.9 |
| 21 | wind_speed_10m_toulouse | Float32 | 0 (0.0%) | 2123 (5.8%) | 9.88 | 6.48 | 0.00 | 8.65 | 44.6 |
| 22 | cloud_cover_toulouse | Float32 | 1 (< 0.1%) | 103 (0.3%) | 62.1 | 42.2 | -1.00 | 86.0 | 101. |
| 23 | soil_moisture_1_to_3cm_toulouse | Float32 | 14414 (39.2%) | 310 (0.8%) | 0.271 | 0.0505 | 0.104 | 0.285 | 0.454 |
| 24 | relative_humidity_2m_toulouse | Float32 | 0 (0.0%) | 93 (0.3%) | 69.7 | 18.7 | 8.00 | 73.0 | 100. |
| 25 | temperature_2m_lille | Float32 | 0 (0.0%) | 2081 (5.7%) | 12.2 | 6.58 | -6.32 | 11.8 | 40.8 |
| 26 | precipitation_lille | Float32 | 1 (< 0.1%) | 75 (0.2%) | 0.0974 | 0.417 | 0.00 | 0.00 | 14.7 |
| 27 | wind_speed_10m_lille | Float32 | 0 (0.0%) | 2536 (6.9%) | 12.9 | 6.60 | 0.00 | 11.7 | 61.9 |
| 28 | cloud_cover_lille | Float32 | 1 (< 0.1%) | 103 (0.3%) | 67.5 | 40.4 | -1.00 | 96.0 | 101. |
| 29 | soil_moisture_1_to_3cm_lille | Float32 | 14414 (39.2%) | 209 (0.6%) | 0.306 | 0.0315 | 0.203 | 0.311 | 0.422 |
| 30 | relative_humidity_2m_lille | Float32 | 0 (0.0%) | 96 (0.3%) | 74.6 | 17.1 | 0.00 | 79.0 | 100. |
| 31 | temperature_2m_limoges | Float32 | 0 (0.0%) | 1572 (4.3%) | 12.7 | 7.35 | -7.70 | 12.1 | 39.7 |
| 32 | precipitation_limoges | Float32 | 1 (< 0.1%) | 153 (0.4%) | 0.122 | 0.621 | 0.00 | 0.00 | 45.5 |
| 33 | wind_speed_10m_limoges | Float32 | 0 (0.0%) | 1359 (3.7%) | 7.57 | 4.77 | 0.00 | 6.49 | 33.9 |
| 34 | cloud_cover_limoges | Float32 | 1 (< 0.1%) | 103 (0.3%) | 66.4 | 40.8 | -1.00 | 93.0 | 101. |
| 35 | soil_moisture_1_to_3cm_limoges | Float32 | 14414 (39.2%) | 302 (0.8%) | 0.282 | 0.0554 | 0.115 | 0.298 | 0.450 |
| 36 | relative_humidity_2m_limoges | Float32 | 0 (0.0%) | 93 (0.3%) | 75.2 | 19.9 | 8.00 | 81.0 | 100. |
| 37 | temperature_2m_nantes | Float32 | 0 (0.0%) | 1539 (4.2%) | 13.8 | 6.65 | -3.86 | 13.3 | 43.4 |
| 38 | precipitation_nantes | Float32 | 1 (< 0.1%) | 112 (0.3%) | 0.0866 | 0.436 | 0.00 | 0.00 | 14.1 |
| 39 | wind_speed_10m_nantes | Float32 | 0 (0.0%) | 2834 (7.7%) | 13.4 | 6.91 | 0.00 | 12.0 | 58.6 |
| 40 | cloud_cover_nantes | Float32 | 1 (< 0.1%) | 103 (0.3%) | 65.0 | 41.3 | -1.00 | 93.0 | 101. |
| 41 | soil_moisture_1_to_3cm_nantes | Float32 | 14480 (39.4%) | 314 (0.9%) | 0.276 | 0.0658 | 0.110 | 0.295 | 0.423 |
| 42 | relative_humidity_2m_nantes | Float32 | 0 (0.0%) | 94 (0.3%) | 74.0 | 17.3 | 7.00 | 78.0 | 100. |
| 43 | temperature_2m_strasbourg | Float32 | 0 (0.0%) | 1525 (4.2%) | 12.7 | 7.74 | -9.31 | 12.3 | 38.8 |
| 44 | precipitation_strasbourg | Float32 | 1 (< 0.1%) | 128 (0.3%) | 0.102 | 0.516 | 0.00 | 0.00 | 22.1 |
| 45 | wind_speed_10m_strasbourg | Float32 | 0 (0.0%) | 1523 (4.1%) | 8.45 | 5.05 | 0.00 | 7.52 | 38.1 |
| 46 | cloud_cover_strasbourg | Float32 | 1 (< 0.1%) | 103 (0.3%) | 69.7 | 40.2 | -1.00 | 98.0 | 101. |
| 47 | soil_moisture_1_to_3cm_strasbourg | Float32 | 14414 (39.2%) | 304 (0.8%) | 0.329 | 0.0519 | 0.159 | 0.343 | 0.468 |
| 48 | relative_humidity_2m_strasbourg | Float32 | 0 (0.0%) | 88 (0.2%) | 71.9 | 18.4 | 13.0 | 75.0 | 100. |
| 49 | temperature_2m_brest | Float32 | 0 (0.0%) | 1265 (3.4%) | 12.9 | 4.89 | -2.33 | 12.6 | 40.5 |
| 50 | precipitation_brest | Float32 | 1 (< 0.1%) | 108 (0.3%) | 0.106 | 0.431 | 0.00 | 0.00 | 12.7 |
| 51 | wind_speed_10m_brest | Float32 | 0 (0.0%) | 3782 (10.3%) | 16.2 | 8.89 | 0.00 | 14.5 | 67.3 |
| 52 | cloud_cover_brest | Float32 | 1 (< 0.1%) | 102 (0.3%) | 67.9 | 39.8 | 0.00 | 96.0 | 101. |
| 53 | soil_moisture_1_to_3cm_brest | Float32 | 14480 (39.4%) | 279 (0.8%) | 0.266 | 0.0572 | 0.116 | 0.277 | 0.409 |
| 54 | relative_humidity_2m_brest | Float32 | 0 (0.0%) | 90 (0.2%) | 78.2 | 13.9 | 10.0 | 81.0 | 100. |
| 55 | temperature_2m_bayonne | Float32 | 0 (0.0%) | 1554 (4.2%) | 15.0 | 6.40 | -3.32 | 14.9 | 42.4 |
| 56 | precipitation_bayonne | Float32 | 1 (< 0.1%) | 131 (0.4%) | 0.144 | 0.551 | 0.00 | 0.00 | 18.5 |
| 57 | wind_speed_10m_bayonne | Float32 | 0 (0.0%) | 2488 (6.8%) | 10.9 | 6.71 | 0.00 | 9.36 | 51.5 |
| 58 | cloud_cover_bayonne | Float32 | 1 (< 0.1%) | 103 (0.3%) | 66.3 | 40.8 | -1.00 | 94.0 | 101. |
| 59 | soil_moisture_1_to_3cm_bayonne | Float32 | 14480 (39.4%) | 299 (0.8%) | 0.276 | 0.0509 | 0.0970 | 0.283 | 0.414 |
| 60 | relative_humidity_2m_bayonne | Float32 | 0 (0.0%) | 91 (0.2%) | 76.2 | 16.0 | 9.00 | 79.0 | 100. |
Calendar and holidays features#
We leverage the holidays package to enrich the time range with some
calendar features such as public holidays in France. We also add some
features that are useful for time series forecasting such as the day of the
week, the day of the year, and the hour of the day.
Note that the holidays package requires us to extract the date in the French timezone.
The other calendar features are likewise extracted from the time expressed in the French timezone.
holidays_fr = holidays.country_holidays("FR", years=range(2019, 2026))
fr_time = pl.col("time").dt.convert_time_zone("Europe/Paris")
calendar = time.with_columns(
    [
        fr_time.dt.date().is_in(holidays_fr.keys()).alias("is_holiday_fr"),
        fr_time.dt.weekday().alias("day_of_week_fr"),
        fr_time.dt.ordinal_day().alias("day_of_year_fr"),
        fr_time.dt.hour().alias("hour_of_day_fr"),
    ],
)
calendar
| time | is_holiday_fr | day_of_week_fr | day_of_year_fr | hour_of_day_fr |
|---|---|---|---|---|
| 2021-03-23 00:00:00+00:00 | False | 2 | 82 | 1 |
| 2021-03-23 01:00:00+00:00 | False | 2 | 82 | 2 |
| 2021-03-23 02:00:00+00:00 | False | 2 | 82 | 3 |
| 2021-03-23 03:00:00+00:00 | False | 2 | 82 | 4 |
| 2021-03-23 04:00:00+00:00 | False | 2 | 82 | 5 |
| 2025-05-31 19:00:00+00:00 | False | 6 | 151 | 21 |
| 2025-05-31 20:00:00+00:00 | False | 6 | 151 | 22 |
| 2025-05-31 21:00:00+00:00 | False | 6 | 151 | 23 |
| 2025-05-31 22:00:00+00:00 | False | 7 | 152 | 0 |
| 2025-05-31 23:00:00+00:00 | False | 7 | 152 | 1 |
time
Datetime- Null values
- 0 (0.0%)
- Unique values
- 36,744 (100.0%)
- Min | Max
- 2021-03-23T00:00:00+00:00 | 2025-05-31T23:00:00+00:00
is_holiday_fr
Boolean- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
day_of_week_fr
Int8- Null values
- 0 (0.0%)
- Unique values
- 7 (< 0.1%)
- Mean ± Std
- 4.00 ± 2.00
- Median ± IQR
- 4.00 ± 4.00
- Min | Max
- 1.00 | 7.00
day_of_year_fr
Int16- Null values
- 0 (0.0%)
- Unique values
- 366 (1.0%)
- Mean ± Std
- 180. ± 104.
- Median ± IQR
- 174. ± 177.
- Min | Max
- 1.00 | 366.
hour_of_day_fr
Int8- Null values
- 0 (0.0%)
- Unique values
- 24 (< 0.1%)
- Mean ± Std
- 11.5 ± 6.92
- Median ± IQR
- 12.0 ± 11.0
- Min | Max
- 0.00 | 23.0
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | time | Datetime | 0 (0.0%) | 36744 (100.0%) | | | 2021-03-23T00:00:00+00:00 | | 2025-05-31T23:00:00+00:00 |
| 1 | is_holiday_fr | Boolean | 0 (0.0%) | 2 (< 0.1%) | | | | | |
| 2 | day_of_week_fr | Int8 | 0 (0.0%) | 7 (< 0.1%) | 4.00 | 2.00 | 1.00 | 4.00 | 7.00 |
| 3 | day_of_year_fr | Int16 | 0 (0.0%) | 366 (1.0%) | 180. | 104. | 1.00 | 174. | 366. |
| 4 | hour_of_day_fr | Int8 | 0 (0.0%) | 24 (< 0.1%) | 11.5 | 6.92 | 0.00 | 12.0 | 23.0 |
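To see why the calendar features are computed from Paris time rather than from UTC, the following minimal sketch converts one of the UTC timestamps from the preview above. It uses only the Python standard library rather than the polars expressions of the notebook, and the variable names are illustrative.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# One of the UTC timestamps shown in the calendar preview.
utc_time = datetime(2025, 5, 31, 22, 0, tzinfo=timezone.utc)

# The same instant expressed in French local time (UTC+2 in summer).
paris_time = utc_time.astimezone(ZoneInfo("Europe/Paris"))


def calendar_features(moment):
    """Return (ISO weekday with Monday=1, day of year, hour of day)."""
    return (moment.isoweekday(), moment.timetuple().tm_yday, moment.hour)


utc_features = calendar_features(utc_time)
paris_features = calendar_features(paris_time)

print(utc_features)    # (6, 151, 22)
print(paris_features)  # (7, 152, 0)
```

The Paris-time values match the `2025-05-31 22:00:00+00:00` row of the preview: in local time, that hour already belongs to Sunday, June 1.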
Electricity load data#
Finally, we load the electricity load data. This data will be used both as the target variable and to craft lagged and window-aggregated features.
load_data_files = [
    data_file
    for data_file in sorted(data_source_folder.iterdir())
    if data_file.name.startswith("Total Load - Day Ahead")
    and data_file.name.endswith(".csv")
]
electricity_raw = skrub.var(
    "electricity_raw",
    pl.concat(
        [
            pl.from_pandas(pd.read_csv(data_file, na_values=["N/A", "-"])).drop(
                ["Day-ahead Total Load Forecast [MW] - BZN|FR"]
            )
            for data_file in load_data_files
        ],
        how="vertical",
    ),
)
electricity_raw
| Time (UTC) | Actual Total Load [MW] - BZN\|FR |
|---|---|
| 01.01.2021 00:00 - 01.01.2021 01:00 | 64139.0 |
| 01.01.2021 01:00 - 01.01.2021 02:00 | 62657.0 |
| 01.01.2021 02:00 - 01.01.2021 03:00 | 59481.0 |
| 01.01.2021 03:00 - 01.01.2021 04:00 | 57656.0 |
| 01.01.2021 04:00 - 01.01.2021 05:00 | 57640.0 |
| 31.12.2025 22:45 - 31.12.2025 23:00 | |
| 31.12.2025 23:00 - 31.12.2025 23:15 | |
| 31.12.2025 23:15 - 31.12.2025 23:30 | |
| 31.12.2025 23:30 - 31.12.2025 23:45 | |
| 31.12.2025 23:45 - 01.01.2026 00:00 | |
Time (UTC)
String- Null values
- 0 (0.0%)
- Unique values
- 70,176 (100.0%)
Actual Total Load [MW] - BZN|FR
Float64- Null values
- 18,032 (25.7%)
- Unique values
- 29,337 (41.8%)
- Mean ± Std
- 5.08e+04 ± 1.11e+04
- Median ± IQR
- 4.88e+04 ± 1.56e+04
- Min | Max
- 2.87e+04 | 8.82e+04
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Time (UTC) | String | 0 (0.0%) | 70176 (100.0%) | | | | | |
| 1 | Actual Total Load [MW] - BZN\|FR | Float64 | 18032 (25.7%) | 29337 (41.8%) | 5.08e+04 | 1.11e+04 | 2.87e+04 | 4.88e+04 | 8.82e+04 |
electricity = (
    electricity_raw.with_columns(
        [
            pl.col("Time (UTC)")
            .str.split(by=" - ")
            .list.first()
            .str.to_datetime("%d.%m.%Y %H:%M", time_zone="UTC")
            .alias("time"),
        ]
    )
    .drop(["Time (UTC)"])
    .rename({"Actual Total Load [MW] - BZN|FR": "load_mw"})
    .filter(pl.col("time").dt.minute().eq(0))
    .filter(pl.col("time") >= time_range_start)
    .filter(pl.col("time") <= time_range_end)
    .select(["time", "load_mw"])
)
electricity
| time | load_mw |
|---|---|
| 2021-03-23 00:00:00+00:00 | 59823.0 |
| 2021-03-23 01:00:00+00:00 | 59369.0 |
| 2021-03-23 02:00:00+00:00 | 57550.0 |
| 2021-03-23 03:00:00+00:00 | 57188.0 |
| 2021-03-23 04:00:00+00:00 | 60367.0 |
| 2025-05-31 19:00:00+00:00 | 39069.0 |
| 2025-05-31 20:00:00+00:00 | 40387.0 |
| 2025-05-31 21:00:00+00:00 | 41174.0 |
| 2025-05-31 22:00:00+00:00 | 39664.0 |
| 2025-05-31 23:00:00+00:00 | 36067.0 |
time
Datetime- Null values
- 0 (0.0%)
- Unique values
- 36,744 (100.0%)
- Min | Max
- 2021-03-23T00:00:00+00:00 | 2025-05-31T23:00:00+00:00
load_mw
Float64- Null values
- 36 (< 0.1%)
- Unique values
- 23,318 (63.5%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | time | Datetime | 0 (0.0%) | 36744 (100.0%) | | | 2021-03-23T00:00:00+00:00 | | 2025-05-31T23:00:00+00:00 |
| 1 | load_mw | Float64 | 36 (< 0.1%) | 23318 (63.5%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
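The cleaning step above keeps the start of each `Time (UTC)` interval and discards the sub-hourly rows. As a rough illustration of that logic outside polars, here is a minimal standard-library sketch; `parse_interval_start` is a helper introduced here for illustration, and the two sample strings are taken from the raw preview.

```python
from datetime import datetime, timezone


def parse_interval_start(interval: str) -> datetime:
    """Parse the start of a 'start - end' interval string as a UTC datetime."""
    start_text = interval.split(" - ")[0]
    parsed = datetime.strptime(start_text, "%d.%m.%Y %H:%M")
    return parsed.replace(tzinfo=timezone.utc)


raw_intervals = [
    "01.01.2021 00:00 - 01.01.2021 01:00",  # hourly row
    "31.12.2025 22:45 - 31.12.2025 23:00",  # quarter-hourly row
]
starts = [parse_interval_start(text) for text in raw_intervals]

# Keep only timestamps that fall exactly on the hour.
hourly_starts = [start for start in starts if start.minute == 0]
print(hourly_starts)
```

Only the first timestamp survives the filter, mirroring the `dt.minute().eq(0)` condition in the notebook code.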
electricity.filter(pl.col("load_mw").is_null())
| time | load_mw |
|---|---|
| 2021-05-12 08:00:00+00:00 | |
| 2021-05-19 04:00:00+00:00 | |
| 2021-06-03 16:00:00+00:00 | |
| 2021-10-31 00:00:00+00:00 | |
| 2021-10-31 01:00:00+00:00 | |
| 2023-03-26 00:00:00+00:00 | |
| 2023-04-17 12:00:00+00:00 | |
| 2023-04-17 13:00:00+00:00 | |
| 2024-12-31 23:00:00+00:00 | |
| 2025-03-30 02:00:00+00:00 |
time
Datetime- Null values
- 0 (0.0%)
- Unique values
- 36 (100.0%)
- Min | Max
- 2021-05-12T08:00:00+00:00 | 2025-03-30T02:00:00+00:00
load_mw
Float64- Null values
- 36 (100.0%)
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | time | Datetime | 0 (0.0%) | 36 (100.0%) | | | 2021-05-12T08:00:00+00:00 | | 2025-03-30T02:00:00+00:00 |
| 1 | load_mw | Float64 | 36 (100.0%) | | | | | | |
electricity.filter(
    (pl.col("time") > pl.datetime(2021, 10, 30, hour=10, time_zone="UTC"))
    & (pl.col("time") < pl.datetime(2021, 10, 31, hour=10, time_zone="UTC"))
).skb.eval().plot.line(x="time:T", y="load_mw:Q")
electricity = electricity.with_columns([pl.col("load_mw").interpolate()])
electricity.filter(
    (pl.col("time") > pl.datetime(2021, 10, 30, hour=10, time_zone="UTC"))
    & (pl.col("time") < pl.datetime(2021, 10, 31, hour=10, time_zone="UTC"))
).skb.eval().plot.line(x="time:T", y="load_mw:Q")
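The `interpolate()` call above fills the missing load values between known neighbors. The plain-Python sketch below illustrates linear gap filling on a toy series; it is not the polars implementation, and `interpolate_linear` and the sample values are made up for illustration.

```python
def interpolate_linear(values):
    """Fill interior None gaps linearly between the nearest known neighbors.

    Leading and trailing None values are left untouched (no extrapolation).
    """
    filled = list(values)
    known = [i for i, value in enumerate(values) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Toy hourly load with a one-hour gap and a three-hour gap.
toy_load = [100.0, None, 120.0, None, None, None, 160.0]
filled_load = interpolate_linear(toy_load)
print(filled_load)
```

Each gap is replaced by evenly spaced values between its two neighbors, which is why the interpolated segment appears as a straight line in the second plot.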
Check that the number of rows matches our expectations based on the number of hours that separate the first and the last dates. We can do that by joining with the time range dataframe and checking that the number of rows stays the same.
assert (
    time.join(electricity, on="time", how="inner").shape[0] == time.shape[0]
).skb.eval()
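The expected row count can also be worked out by hand from the two bounds. A minimal sketch, assuming the first and last timestamps are the min and max of the `time` column reported above:

```python
from datetime import datetime, timedelta, timezone

# Bounds taken from the min and max of the "time" column reported above.
first = datetime(2021, 3, 23, 0, tzinfo=timezone.utc)
last = datetime(2025, 5, 31, 23, tzinfo=timezone.utc)

# Number of hourly timestamps in the closed interval [first, last].
expected_rows = (last - first) // timedelta(hours=1) + 1
print(expected_rows)  # 36744
```

This agrees with the 36,744 unique `time` values shown in the reports.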
Lagged features#
We can now create some lagged features from the electricity load data.
We will create 3 hourly lagged features, 1 daily lagged feature, and 1 weekly lagged feature. We will also create a rolling median and a rolling inter-quartile range (IQR) over the last 24 hours and over the last 7 days.
def iqr(col, *, window_size: int):
    """Inter-quartile range (IQR) of a column."""
    return col.rolling_quantile(0.75, window_size=window_size) - col.rolling_quantile(
        0.25, window_size=window_size
    )
electricity_lagged = electricity.with_columns(
    [pl.col("load_mw").shift(i).alias(f"load_mw_lag_{i}h") for i in range(1, 4)]
    + [
        pl.col("load_mw").shift(24).alias("load_mw_lag_1d"),
        pl.col("load_mw").shift(24 * 7).alias("load_mw_lag_1w"),
        pl.col("load_mw")
        .rolling_median(window_size=24)
        .alias("load_mw_rolling_median_24h"),
        pl.col("load_mw")
        .rolling_median(window_size=24 * 7)
        .alias("load_mw_rolling_median_7d"),
        iqr(pl.col("load_mw"), window_size=24).alias("load_mw_iqr_24h"),
        iqr(pl.col("load_mw"), window_size=24 * 7).alias("load_mw_iqr_7d"),
    ],
)
electricity_lagged
| time | load_mw | load_mw_lag_1h | load_mw_lag_2h | load_mw_lag_3h | load_mw_lag_1d | load_mw_lag_1w | load_mw_rolling_median_24h | load_mw_rolling_median_7d | load_mw_iqr_24h | load_mw_iqr_7d |
|---|---|---|---|---|---|---|---|---|---|---|
| 2021-03-23 00:00:00+00:00 | 59823.0 | |||||||||
| 2021-03-23 01:00:00+00:00 | 59369.0 | 59823.0 | ||||||||
| 2021-03-23 02:00:00+00:00 | 57550.0 | 59369.0 | 59823.0 | |||||||
| 2021-03-23 03:00:00+00:00 | 57188.0 | 57550.0 | 59369.0 | 59823.0 | ||||||
| 2021-03-23 04:00:00+00:00 | 60367.0 | 57188.0 | 57550.0 | 59369.0 | ||||||
| 2025-05-31 19:00:00+00:00 | 39069.0 | 39980.0 | 40890.0 | 40175.0 | 41584.0 | 39144.0 | 39356.0 | 40659.0 | 4231.0 | 7238.0 |
| 2025-05-31 20:00:00+00:00 | 40387.0 | 39069.0 | 39980.0 | 40890.0 | 42931.0 | 40286.0 | 39356.0 | 40659.0 | 4159.0 | 7238.0 |
| 2025-05-31 21:00:00+00:00 | 41174.0 | 40387.0 | 39069.0 | 39980.0 | 43812.0 | 41468.0 | 39356.0 | 40659.0 | 4159.0 | 7238.0 |
| 2025-05-31 22:00:00+00:00 | 39664.0 | 41174.0 | 40387.0 | 39069.0 | 41966.0 | 40346.0 | 39356.0 | 40659.0 | 4140.0 | 7238.0 |
| 2025-05-31 23:00:00+00:00 | 36067.0 | 39664.0 | 41174.0 | 40387.0 | 38248.0 | 37076.0 | 39356.0 | 40659.0 | 4823.0 | 7239.0 |
time
Datetime- Null values
- 0 (0.0%)
- Unique values
- 36,744 (100.0%)
- Min | Max
- 2021-03-23T00:00:00+00:00 | 2025-05-31T23:00:00+00:00
load_mw
Float64- Null values
- 0 (0.0%)
- Unique values
- 23,353 (63.6%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_lag_1h
Float64- Null values
- 1 (< 0.1%)
- Unique values
- 23,353 (63.6%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_lag_2h
Float64- Null values
- 2 (< 0.1%)
- Unique values
- 23,352 (63.6%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_lag_3h
Float64- Null values
- 3 (< 0.1%)
- Unique values
- 23,352 (63.6%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_lag_1d
Float64- Null values
- 24 (< 0.1%)
- Unique values
- 23,342 (63.5%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_lag_1w
Float64- Null values
- 168 (0.5%)
- Unique values
- 23,293 (63.4%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.82e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_rolling_median_24h
Float64- Null values
- 23 (< 0.1%)
- Unique values
- 9,644 (26.2%)
- Mean ± Std
- 5.06e+04 ± 9.28e+03
- Median ± IQR
- 4.75e+04 ± 1.29e+04
- Min | Max
- 3.37e+04 | 7.84e+04
load_mw_rolling_median_7d
Float64- Null values
- 167 (0.5%)
- Unique values
- 7,138 (19.4%)
- Mean ± Std
- 5.01e+04 ± 8.82e+03
- Median ± IQR
- 4.60e+04 ± 1.35e+04
- Min | Max
- 3.85e+04 | 7.39e+04
load_mw_iqr_24h
Float64- Null values
- 23 (< 0.1%)
- Unique values
- 5,922 (16.1%)
- Mean ± Std
- 6.52e+03 ± 1.56e+03
- Median ± IQR
- 6.43e+03 ± 2.05e+03
- Min | Max
- 2.32e+03 | 1.60e+04
load_mw_iqr_7d
Float64- Null values
- 167 (0.5%)
- Unique values
- 5,327 (14.5%)
- Mean ± Std
- 8.30e+03 ± 1.41e+03
- Median ± IQR
- 8.27e+03 ± 1.63e+03
- Min | Max
- 5.04e+03 | 1.86e+04
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | time | Datetime | 0 (0.0%) | 36744 (100.0%) | | | 2021-03-23T00:00:00+00:00 | | 2025-05-31T23:00:00+00:00 |
| 1 | load_mw | Float64 | 0 (0.0%) | 23353 (63.6%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 2 | load_mw_lag_1h | Float64 | 1 (< 0.1%) | 23353 (63.6%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 3 | load_mw_lag_2h | Float64 | 2 (< 0.1%) | 23352 (63.6%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 4 | load_mw_lag_3h | Float64 | 3 (< 0.1%) | 23352 (63.6%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 5 | load_mw_lag_1d | Float64 | 24 (< 0.1%) | 23342 (63.5%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 6 | load_mw_lag_1w | Float64 | 168 (0.5%) | 23293 (63.4%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.82e+04 | 8.66e+04 |
| 7 | load_mw_rolling_median_24h | Float64 | 23 (< 0.1%) | 9644 (26.2%) | 5.06e+04 | 9.28e+03 | 3.37e+04 | 4.75e+04 | 7.84e+04 |
| 8 | load_mw_rolling_median_7d | Float64 | 167 (0.5%) | 7138 (19.4%) | 5.01e+04 | 8.82e+03 | 3.85e+04 | 4.60e+04 | 7.39e+04 |
| 9 | load_mw_iqr_24h | Float64 | 23 (< 0.1%) | 5922 (16.1%) | 6.52e+03 | 1.56e+03 | 2.32e+03 | 6.43e+03 | 1.60e+04 |
| 10 | load_mw_iqr_7d | Float64 | 167 (0.5%) | 5327 (14.5%) | 8.30e+03 | 1.41e+03 | 5.04e+03 | 8.27e+03 | 1.86e+04 |
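The null counts in the report follow from how lags and rolling windows are defined: a lag of `k` steps has no value for the first `k` rows, while a rolling window of size `w` that ends at the current row is incomplete for the first `w - 1` rows (hence 24 nulls for the 1-day lag but 23 for the 24-hour rolling median). The plain-Python sketch below reproduces these semantics on a toy series; `lag` and `rolling_median` are illustrative helpers, not the polars functions.

```python
from statistics import median


def lag(values, periods):
    """Shift values forward by `periods` steps, padding the start with None."""
    return [None] * periods + values[: len(values) - periods]


def rolling_median(values, window_size):
    """Median over the last `window_size` values, current value included."""
    result = []
    for i in range(len(values)):
        if i + 1 < window_size:
            result.append(None)  # Window not yet complete.
        else:
            result.append(median(values[i + 1 - window_size : i + 1]))
    return result


toy_load = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0]
lag_2 = lag(toy_load, 2)
median_3 = rolling_median(toy_load, 3)

print(lag_2)     # [None, None, 10.0, 12.0, 11.0, 15.0]
print(median_3)  # [None, None, 11.0, 12.0, 14.0, 14.0]
```

Note that with this window definition the rolling statistics at a given row include the load observed at that same row.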
import altair
altair.Chart(electricity_lagged.tail(100).skb.eval()).transform_fold(
    [
        "load_mw",
        "load_mw_lag_1h",
        "load_mw_lag_2h",
        "load_mw_lag_3h",
        "load_mw_lag_1d",
        "load_mw_lag_1w",
        "load_mw_rolling_median_24h",
        "load_mw_rolling_median_7d",
        "load_mw_iqr_24h",
        "load_mw_iqr_7d",
    ],
    as_=["key", "load_mw"],
).mark_line(tooltip=True).encode(x="time:T", y="load_mw:Q", color="key:N").interactive()
Investigating outliers in the lagged features#
Let’s use the skrub.TableReport tool to look at the plots of the marginal
distribution of the lagged features.
from skrub import TableReport
TableReport(electricity_lagged.skb.eval())
| Column 1 | Column 2 | Cramér's V | Pearson's Correlation |
|---|---|---|---|
| load_mw_lag_2h | load_mw_lag_3h | 0.737 | 0.981 |
| load_mw | load_mw_lag_1h | 0.725 | 0.980 |
| load_mw_lag_1h | load_mw_lag_2h | 0.721 | 0.980 |
| load_mw_lag_1d | load_mw_rolling_median_24h | 0.615 | 0.883 |
| load_mw_lag_1w | load_mw_rolling_median_7d | 0.613 | 0.819 |
| load_mw | load_mw_lag_1d | 0.602 | 0.930 |
| load_mw | load_mw_lag_2h | 0.564 | 0.938 |
| load_mw_lag_1h | load_mw_lag_3h | 0.561 | 0.936 |
| load_mw_lag_1h | load_mw_lag_1d | 0.560 | 0.916 |
| load_mw_rolling_median_24h | load_mw_rolling_median_7d | 0.553 | 0.911 |
| load_mw | load_mw_lag_1w | 0.510 | 0.870 |
| load_mw_lag_2h | load_mw_lag_1d | 0.503 | 0.879 |
| load_mw_lag_3h | load_mw_rolling_median_24h | 0.493 | 0.892 |
| load_mw_lag_2h | load_mw_rolling_median_24h | 0.492 | 0.889 |
| load_mw_lag_1h | load_mw_rolling_median_24h | 0.489 | 0.884 |
| load_mw_rolling_median_7d | load_mw_iqr_7d | 0.488 | 0.192 |
| load_mw_lag_1d | load_mw_lag_1w | 0.488 | 0.833 |
| load_mw_lag_1h | load_mw_lag_1w | 0.483 | 0.854 |
| load_mw_lag_1d | load_mw_rolling_median_7d | 0.482 | 0.836 |
| load_mw | load_mw_rolling_median_24h | 0.481 | 0.879 |
Let’s extract the dates where the 7-day rolling inter-quartile range of the load is greater than 15,000 MW.
electricity_lagged.filter(pl.col("load_mw_iqr_7d") > 15_000)[
    "time"
].dt.date().unique().sort().to_list().skb.eval()
[datetime.date(2021, 12, 26),
datetime.date(2021, 12, 27),
datetime.date(2021, 12, 28),
datetime.date(2022, 1, 7),
datetime.date(2022, 1, 8),
datetime.date(2023, 1, 19),
datetime.date(2023, 1, 20),
datetime.date(2023, 1, 21),
datetime.date(2024, 1, 10),
datetime.date(2024, 1, 11),
datetime.date(2024, 1, 12),
datetime.date(2024, 1, 13)]
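Grouping the dates listed above into runs of consecutive days makes the separate periods easier to count. A minimal sketch using only the standard library; `group_consecutive` is a helper introduced here for illustration.

```python
from datetime import date, timedelta

# Dates copied from the output above.
high_iqr_dates = [
    date(2021, 12, 26), date(2021, 12, 27), date(2021, 12, 28),
    date(2022, 1, 7), date(2022, 1, 8),
    date(2023, 1, 19), date(2023, 1, 20), date(2023, 1, 21),
    date(2024, 1, 10), date(2024, 1, 11), date(2024, 1, 12), date(2024, 1, 13),
]


def group_consecutive(days):
    """Group dates into (first, last) pairs of consecutive-day runs."""
    runs = []
    for day in sorted(days):
        if runs and day - runs[-1][-1] == timedelta(days=1):
            runs[-1].append(day)  # Extends the current run.
        else:
            runs.append([day])  # Starts a new run.
    return [(run[0], run[-1]) for run in runs]


date_ranges = group_consecutive(high_iqr_dates)
for first_day, last_day in date_ranges:
    print(first_day, "to", last_day)
```

On this list the helper returns four runs: late December 2021, early January 2022, mid-January 2023, and mid-January 2024.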
We observe 4 date ranges with a high inter-quartile range. Let’s plot the electricity load and its 7-day rolling inter-quartile range for the first date range, along with the temperature data for each city.
altair.Chart(
    electricity_lagged.filter(
        (pl.col("time") > pl.datetime(2021, 12, 1, time_zone="UTC"))
        & (pl.col("time") < pl.datetime(2021, 12, 31, time_zone="UTC"))
    ).skb.eval()
).transform_fold(
    [
        "load_mw",
        "load_mw_iqr_7d",
    ],
).mark_line(
    tooltip=True
).encode(
    x="time:T", y="value:Q", color="key:N"
).interactive()
altair.Chart(
    all_city_weather.filter(
        (pl.col("time") > pl.datetime(2021, 12, 1, time_zone="UTC"))
        & (pl.col("time") < pl.datetime(2021, 12, 31, time_zone="UTC"))
    ).skb.eval()
).transform_fold(
    [f"temperature_2m_{city_name}" for city_name in city_names],
).mark_line(
    tooltip=True
).encode(
    x="time:T", y="value:Q", color="key:N"
).interactive()
Based on the plots above, we can see that the electricity load was high just before the Christmas holidays due to low temperatures. Then the load suddenly dropped because temperatures rose right at the start of the end-of-year holidays.
So those outliers do not seem to be caused by a data quality issue but rather by a real change in electricity demand. We could conduct a similar analysis for the other date ranges with a high inter-quartile range, but we will skip that for now.
Had we observed significant data quality issues over extended periods of
time, they could have been addressed by removing the corresponding rows from
the dataset. However, this would make the lagged and windowing feature
engineering challenging to reimplement correctly. A better approach would be
to keep a contiguous dataset and assign zero weights to the affected rows when
fitting or evaluating the trained models, via the sample_weight
parameter.
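As a rough illustration of the weighting idea, the sketch below builds a 0/1 weight vector for a contiguous hourly series. The timestamps and the flagged period are hypothetical, and the final comment only indicates how such weights would typically be passed to an estimator whose `fit` method accepts `sample_weight`.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical hourly timestamps covering two days.
start = datetime(2021, 5, 12, 0, tzinfo=timezone.utc)
times = [start + timedelta(hours=h) for h in range(48)]

# Hypothetical period flagged as unreliable (half-open interval).
bad_start = datetime(2021, 5, 12, 6, tzinfo=timezone.utc)
bad_end = datetime(2021, 5, 12, 12, tzinfo=timezone.utc)

# Weight 0 for affected rows and 1 elsewhere: the rows stay in place, so lags
# and rolling windows remain aligned, but the affected rows do not count.
sample_weight = [0.0 if bad_start <= t < bad_end else 1.0 for t in times]

print(sum(sample_weight))  # 42.0

# With an estimator that supports it, this would be passed as:
# model.fit(X, y, sample_weight=sample_weight)
```

The dataset keeps all 48 rows, and only the 6 flagged hours are ignored by the loss.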
Final dataset#
We now assemble the dataset that will be used to train and evaluate the forecasting models via backtesting.
prediction_time = time = skrub.var(
    "prediction_time",
    pl.DataFrame().with_columns(
        pl.datetime_range(
            start=time_range_start + pl.duration(days=7),
            end=time_range_end - pl.duration(hours=24),
            time_zone="UTC",
            interval="1h",
        ).alias("prediction_time"),
    ),
)
prediction_time
| prediction_time |
|---|
| 2021-03-30 00:00:00+00:00 |
| 2021-03-30 01:00:00+00:00 |
| 2021-03-30 02:00:00+00:00 |
| 2021-03-30 03:00:00+00:00 |
| 2021-03-30 04:00:00+00:00 |
| 2025-05-30 19:00:00+00:00 |
| 2025-05-30 20:00:00+00:00 |
| 2025-05-30 21:00:00+00:00 |
| 2025-05-30 22:00:00+00:00 |
| 2025-05-30 23:00:00+00:00 |
prediction_time
Datetime- Null values
- 0 (0.0%)
- Unique values
- 36,552 (100.0%)
- Min | Max
- 2021-03-30T00:00:00+00:00 | 2025-05-30T23:00:00+00:00
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | prediction_time | Datetime | 0 (0.0%) | 36552 (100.0%) | | | 2021-03-30T00:00:00+00:00 | | 2025-05-30T23:00:00+00:00 |
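The prediction times start 7 days after the beginning of the data and stop 24 hours before its end, presumably so that the one-week lag and 7-day rolling windows are populated and so that targets exist up to the maximum 24-hour horizon. The sketch below checks the resulting row count with the standard library; the two bounds are assumptions inferred from the min and max of the `time` column shown earlier.

```python
from datetime import datetime, timedelta, timezone

# Assumed bounds, inferred from the min and max of the "time" column.
time_range_start = datetime(2021, 3, 23, 0, tzinfo=timezone.utc)
time_range_end = datetime(2025, 5, 31, 23, tzinfo=timezone.utc)

first_prediction = time_range_start + timedelta(days=7)
last_prediction = time_range_end - timedelta(hours=24)

# Number of hourly prediction times in [first_prediction, last_prediction].
n_prediction_times = (last_prediction - first_prediction) // timedelta(hours=1) + 1

print(first_prediction, last_prediction, n_prediction_times)
```

Under these assumptions the count is 36,552, that is, the 36,744 hourly rows minus 168 hours at the start and 24 hours at the end, which matches the report above.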
features = (
    (
        prediction_time.join(
            electricity_lagged, left_on="prediction_time", right_on="time"
        )
        .join(all_city_weather, left_on="prediction_time", right_on="time")
        .join(calendar, left_on="prediction_time", right_on="time")
    )
    .drop("prediction_time")
    .skb.mark_as_X()
)
features
| load_mw | load_mw_lag_1h | load_mw_lag_2h | load_mw_lag_3h | load_mw_lag_1d | load_mw_lag_1w | load_mw_rolling_median_24h | load_mw_rolling_median_7d | load_mw_iqr_24h | load_mw_iqr_7d | temperature_2m_paris | precipitation_paris | wind_speed_10m_paris | cloud_cover_paris | soil_moisture_1_to_3cm_paris | relative_humidity_2m_paris | temperature_2m_lyon | precipitation_lyon | wind_speed_10m_lyon | cloud_cover_lyon | soil_moisture_1_to_3cm_lyon | relative_humidity_2m_lyon | temperature_2m_marseille | precipitation_marseille | wind_speed_10m_marseille | cloud_cover_marseille | soil_moisture_1_to_3cm_marseille | relative_humidity_2m_marseille | temperature_2m_toulouse | precipitation_toulouse | wind_speed_10m_toulouse | cloud_cover_toulouse | soil_moisture_1_to_3cm_toulouse | relative_humidity_2m_toulouse | temperature_2m_lille | precipitation_lille | wind_speed_10m_lille | cloud_cover_lille | soil_moisture_1_to_3cm_lille | relative_humidity_2m_lille | temperature_2m_limoges | precipitation_limoges | wind_speed_10m_limoges | cloud_cover_limoges | soil_moisture_1_to_3cm_limoges | relative_humidity_2m_limoges | temperature_2m_nantes | precipitation_nantes | wind_speed_10m_nantes | cloud_cover_nantes | soil_moisture_1_to_3cm_nantes | relative_humidity_2m_nantes | temperature_2m_strasbourg | precipitation_strasbourg | wind_speed_10m_strasbourg | cloud_cover_strasbourg | soil_moisture_1_to_3cm_strasbourg | relative_humidity_2m_strasbourg | temperature_2m_brest | precipitation_brest | wind_speed_10m_brest | cloud_cover_brest | soil_moisture_1_to_3cm_brest | relative_humidity_2m_brest | temperature_2m_bayonne | precipitation_bayonne | wind_speed_10m_bayonne | cloud_cover_bayonne | soil_moisture_1_to_3cm_bayonne | relative_humidity_2m_bayonne | is_holiday_fr | day_of_week_fr | day_of_year_fr | hour_of_day_fr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 46395.0 | 47401.0 | 49217.0 | 51561.0 | 48600.0 | 59823.0 | 51122.5 | 54884.5 | 7834.0 | 8199.0 | 13.56450080871582 | 0.0 | 6.287129878997803 | 95.0 |  | 63.0 | 11.085000038146973 | 0.0 | 4.735060214996338 | 9.0 |  | 65.0 | 14.479000091552734 | 0.0 | 1.4399999380111694 | 0.0 |  | 71.0 | 11.882500648498535 | 0.0 | 19.174856185913086 | 0.0 |  | 64.0 | 10.800000190734863 | 0.0 | 7.342587947845459 | 13.0 |  | 69.0 | 7.800999641418457 | 0.0 | 4.320000171661377 | 0.0 |  | 83.0 | 9.878000259399414 | 0.0 | 11.486200332641602 | 0.0 |  | 87.0 | 10.88700008392334 | 0.0 | 2.099142551422119 | 29.0 |  | 62.0 | 10.277999877929688 | 0.0 | 11.92355728149414 | 100.0 |  | 82.0 | 12.748000144958496 | 0.0 | 8.942214012145996 | 6.0 |  | 63.0 | False | 2 | 89 | 2 |
| 44269.0 | 46395.0 | 47401.0 | 49217.0 | 46722.0 | 59369.0 | 51122.5 | 54857.5 | 7834.0 | 8252.0 | 13.06450080871582 | 0.0 | 5.804825305938721 | 100.0 |  | 64.0 | 10.585000038146973 | 0.0 | 4.553679466247559 | 63.0 |  | 65.0 | 14.328999519348145 | 0.0 | 1.527350664138794 | 0.0 |  | 72.0 | 11.582500457763672 | 0.0 | 19.40663719177246 | 0.0 |  | 67.0 | 10.399999618530273 | 0.0 | 7.56856632232666 | 48.0 |  | 67.0 | 7.401000022888184 | 0.0 | 4.320000171661377 | 0.0 |  | 85.0 | 9.527999877929688 | 0.0 | 10.972620010375977 | 0.0 |  | 86.0 | 10.337000846862793 | 0.0 | 1.1384198665618896 | 17.0 |  | 62.0 | 10.32800006866455 | 0.0 | 13.532360076904297 | 11.0 |  | 83.0 | 12.29800033569336 | 0.0 | 9.693296432495117 | 10.0 |  | 64.0 | False | 2 | 89 | 3 |
| 43874.0 | 44269.0 | 46395.0 | 47401.0 | 46329.0 | 57550.0 | 51122.5 | 54603.0 | 7834.0 | 8269.0 | 12.614500045776367 | 0.0 | 5.804825305938721 | 100.0 |  | 65.0 | 10.135000228881836 | 0.0 | 5.091168403625488 | 100.0 |  | 65.0 | 14.178999900817871 | 0.0 | 1.8356469869613647 | 0.0 |  | 72.0 | 11.432499885559082 | 0.0 | 19.914215087890625 | 0.0 |  | 68.0 | 10.050000190734863 | 0.0 | 7.072877883911133 | 58.0 |  | 67.0 | 7.151000022888184 | 0.0 | 4.802999019622803 | 0.0 |  | 85.0 | 9.378000259399414 | 0.0 | 10.703569412231445 | 6.0 |  | 84.0 | 9.88700008392334 | 0.0 | 1.7999999523162842 | 42.0 |  | 63.0 | 10.428000450134277 | 0.0 | 14.489720344543457 | 17.0 |  | 82.0 | 11.89799976348877 | 0.0 | 9.793059349060059 | 6.0 |  | 63.0 | False | 2 | 89 | 4 |
| 46197.0 | 43874.0 | 44269.0 | 46395.0 | 49199.0 | 57188.0 | 51122.5 | 54325.0 | 8856.0 | 8278.0 | 12.214500427246094 | 0.0 | 6.119999885559082 | 100.0 |  | 66.0 | 9.6850004196167 | 0.0 | 5.411986351013184 | 100.0 |  | 65.0 | 14.029000282287598 | 0.0 | 2.5199999809265137 | 0.0 |  | 73.0 | 11.232500076293945 | 0.0 | 20.124610900878906 | 0.0 |  | 70.0 | 9.800000190734863 | 0.0 | 7.072877883911133 | 98.0 |  | 68.0 | 6.60099983215332 | 0.0 | 4.802999019622803 | 0.0 |  | 85.0 | 9.128000259399414 | 0.0 | 10.972620010375977 | 0.0 |  | 84.0 | 9.487000465393066 | 0.0 | 3.5999999046325684 | 18.0 |  | 63.0 | 10.628000259399414 | 0.0 | 14.986553192138672 | 100.0 |  | 78.0 | 11.697999954223633 | 0.0 | 9.511088371276855 | 10.0 |  | 61.0 | False | 2 | 89 | 5 |
| 51913.0 | 46197.0 | 43874.0 | 44269.0 | 54881.0 | 60367.0 | 51122.5 | 54140.0 | 8856.0 | 8278.0 | 11.764500617980957 | 0.0 | 5.315336227416992 | 57.0 |  | 68.0 | 9.135000228881836 | 0.0 | 5.399999618530273 | 61.0 |  | 66.0 | 13.979000091552734 | 0.0 | 3.617955207824707 | 5.0 |  | 72.0 | 10.982500076293945 | 0.0 | 20.150354385375977 | 0.0 |  | 71.0 | 9.550000190734863 | 0.0 | 6.989935398101807 | 100.0 |  | 71.0 | 6.200999736785889 | 0.0 | 4.452953815460205 | 0.0 |  | 83.0 | 8.928000450134277 | 0.0 | 11.18320083618164 | 6.0 |  | 85.0 | 8.937000274658203 | 0.0 | 4.33497428894043 | 19.0 |  | 65.0 | 10.428000450134277 | 0.0 | 15.46324634552002 | 14.0 |  | 77.0 | 11.39799976348877 | 0.0 | 8.714676856994629 | 13.0 |  | 61.0 | False | 2 | 89 | 6 |
| 41584.0 | 43226.0 | 44004.0 | 43246.0 | 40135.0 | 42773.0 | 41951.5 | 40323.5 | 6217.0 | 7202.0 | 29.114999771118164 | 0.0 | 8.427383422851562 | 11.0 | 0.2680000066757202 | 34.0 | 29.211000442504883 | 0.0 | 4.829906940460205 | 0.0 | 0.26899999380111694 | 29.0 | 20.767000198364258 | 0.0 | 9.585739135742188 | 0.0 | 0.14100000262260437 | 79.0 | 30.57000160217285 | 0.0 | 7.704335689544678 | 16.0 | 0.21199999749660492 | 32.0 | 22.8804988861084 | 0.0 | 9.359999656677246 | 0.0 | 0.2750000059604645 | 65.0 | 29.423500061035156 | 0.0 | 6.162207126617432 | 0.0 | 0.1679999977350235 | 33.0 | 26.937000274658203 | 0.0 | 7.928177833557129 | 0.0 | 0.18799999356269836 | 51.0 | 26.5935001373291 | 0.0 | 6.287129878997803 | 0.0 | 0.2639999985694885 | 49.0 | 17.61750030517578 | 0.0 | 11.874544143676758 | 11.0 | 0.16300000250339508 | 69.0 | 21.37649917602539 | 0.0 | 14.98222827911377 | 100.0 | 0.19499999284744263 | 72.0 | False | 5 | 150 | 21 |
| 42931.0 | 41584.0 | 43226.0 | 44004.0 | 41362.0 | 44204.0 | 42382.0 | 40323.5 | 6217.0 | 7196.0 | 27.915000915527344 | 0.0 | 2.1600000858306885 | 66.0 | 0.2680000066757202 | 41.0 | 27.56100082397461 | 0.0 | 0.8049845099449158 | 9.0 | 0.26899999380111694 | 41.0 | 19.567001342773438 | 0.0 | 7.9932966232299805 | 0.0 | 0.14100000262260437 | 86.0 | 27.770000457763672 | 0.0 | 5.86037540435791 | 48.0 | 0.21299999952316284 | 37.0 | 21.08049964904785 | 0.0 | 6.479999542236328 | 99.0 | 0.2750000059604645 | 72.0 | 26.923500061035156 | 0.0 | 3.096837043762207 | 61.0 | 0.16699999570846558 | 48.0 | 25.786998748779297 | 0.0 | 6.8777899742126465 | 0.0 | 0.18799999356269836 | 54.0 | 25.39349937438965 | 0.0 | 4.104631423950195 | 0.0 | 0.2639999985694885 | 53.0 | 16.567501068115234 | 0.0 | 9.957108497619629 | 0.0 | 0.16200000047683716 | 75.0 | 22.726499557495117 | 0.20000000298023224 | 9.422101020812988 | 100.0 | 0.19599999487400055 | 67.0 | False | 5 | 150 | 22 |
| 43812.0 | 42931.0 | 41584.0 | 43226.0 | 42722.0 | 45021.0 | 42382.0 | 40323.5 | 6288.0 | 7181.0 | 26.165000915527344 | 0.0 | 3.2599384784698486 | 48.0 | 0.26899999380111694 | 55.0 | 25.961000442504883 | 0.0 | 1.4843180179595947 | 38.0 | 0.27000001072883606 | 51.0 | 19.767000198364258 | 0.0 | 6.287129878997803 | 0.0 | 0.14000000059604645 | 82.0 | 25.82000160217285 | 0.0 | 4.072935104370117 | 89.0 | 0.2150000035762787 | 42.0 | 20.58049964904785 | 0.0 | 6.839999675750732 | 100.0 | 0.2750000059604645 | 74.0 | 24.37350082397461 | 0.0 | 2.9024126529693604 | 84.0 | 0.16599999368190765 | 66.0 | 24.886999130249023 | 0.0 | 4.829906940460205 | 100.0 | 0.18799999356269836 | 55.0 | 23.443500518798828 | 0.0 | 3.3190360069274902 | 0.0 | 0.2639999985694885 | 66.0 | 15.717499732971191 | 0.0 | 8.55710220336914 | 100.0 | 0.16200000047683716 | 81.0 | 21.026498794555664 | 0.0 | 4.0249223709106445 | 100.0 | 0.1979999989271164 | 75.0 | False | 5 | 150 | 23 |
| 41966.0 | 43812.0 | 42931.0 | 41584.0 | 41152.0 | 43402.0 | 42382.0 | 40323.5 | 6288.0 | 7181.0 | 22.614999771118164 | 0.0 | 11.090103149414062 | 0.0 | 0.27000001072883606 | 68.0 | 23.31100082397461 | 0.0 | 1.8356469869613647 | 0.0 | 0.2709999978542328 | 66.0 | 19.91699981689453 | 0.0 | 5.483356475830078 | 14.0 | 0.14000000059604645 | 85.0 | 24.470001220703125 | 0.0 | 4.213691711425781 | 61.0 | 0.2160000056028366 | 45.0 | 19.6304988861084 | 0.0 | 5.759999752044678 | 100.0 | 0.2750000059604645 | 74.0 | 22.773500442504883 | 0.0 | 1.7999999523162842 | 100.0 | 0.16699999570846558 | 71.0 | 24.187000274658203 | 0.0 | 8.404284477233887 | 17.0 | 0.18799999356269836 | 56.0 | 22.193500518798828 | 0.0 | 2.545584201812744 | 0.0 | 0.2639999985694885 | 73.0 | 15.217499732971191 | 0.0 | 8.89134407043457 | 3.0 | 0.16300000250339508 | 87.0 | 22.12649917602539 | 0.0 | 11.384199142456055 | 100.0 | 0.19900000095367432 | 67.0 | False | 6 | 151 | 0 |
| 38248.0 | 41966.0 | 43812.0 | 42931.0 | 37524.0 | 39496.0 | 42382.0 | 40323.5 | 5564.0 | 7181.0 | 21.065000534057617 | 0.0 | 7.771330833435059 | 0.0 | 0.27000001072883606 | 73.0 | 22.161001205444336 | 0.0 | 1.2979984283447266 | 0.0 | 0.2720000147819519 | 69.0 | 19.66699981689453 | 0.0 | 6.119999885559082 | 39.0 | 0.14000000059604645 | 85.0 | 23.470001220703125 | 0.0 | 3.7064266204833984 | 11.0 | 0.21799999475479126 | 47.0 | 19.030498504638672 | 0.0 | 6.839999675750732 | 100.0 | 0.2759999930858612 | 80.0 | 21.023500442504883 | 0.0 | 2.9024126529693604 | 100.0 | 0.16699999570846558 | 78.0 | 22.687000274658203 | 0.0 | 4.349896430969238 | 100.0 | 0.18799999356269836 | 63.0 | 20.943500518798828 | 0.0 | 3.219938039779663 | 0.0 | 0.26499998569488525 | 80.0 | 14.917499542236328 | 0.0 | 8.66994857788086 | 0.0 | 0.16300000250339508 | 87.0 | 21.426498413085938 | 0.0 | 11.96695327758789 | 100.0 | 0.20100000500679016 | 71.0 | False | 6 | 151 | 1 |
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | load_mw | Float64 | 0 (0.0%) | 23274 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 1 | load_mw_lag_1h | Float64 | 0 (0.0%) | 23274 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 2 | load_mw_lag_2h | Float64 | 0 (0.0%) | 23274 (63.7%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 3 | load_mw_lag_3h | Float64 | 0 (0.0%) | 23274 (63.7%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 4 | load_mw_lag_1d | Float64 | 0 (0.0%) | 23275 (63.7%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 5 | load_mw_lag_1w | Float64 | 0 (0.0%) | 23283 (63.7%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.82e+04 | 8.66e+04 |
| 6 | load_mw_rolling_median_24h | Float64 | 0 (0.0%) | 9599 (26.3%) | 5.05e+04 | 9.29e+03 | 3.37e+04 | 4.74e+04 | 7.84e+04 |
| 7 | load_mw_rolling_median_7d | Float64 | 0 (0.0%) | 7137 (19.5%) | 5.01e+04 | 8.81e+03 | 3.85e+04 | 4.60e+04 | 7.39e+04 |
| 8 | load_mw_iqr_24h | Float64 | 0 (0.0%) | 5908 (16.2%) | 6.52e+03 | 1.56e+03 | 2.32e+03 | 6.43e+03 | 1.60e+04 |
| 9 | load_mw_iqr_7d | Float64 | 0 (0.0%) | 5327 (14.6%) | 8.30e+03 | 1.41e+03 | 5.04e+03 | 8.28e+03 | 1.86e+04 |
| 10 | temperature_2m_paris | Float32 | 0 (0.0%) | 1439 (3.9%) | 13.6 | 7.00 | -5.13 | 13.2 | 40.6 |
| 11 | precipitation_paris | Float32 | 0 (0.0%) | 135 (0.4%) | 0.0914 | 0.543 | 0.00 | 0.00 | 29.5 |
| 12 | wind_speed_10m_paris | Float32 | 0 (0.0%) | 1774 (4.9%) | 10.0 | 5.28 | 0.00 | 9.29 | 50.1 |
| 13 | cloud_cover_paris | Float32 | 0 (0.0%) | 103 (0.3%) | 69.2 | 39.9 | -1.00 | 97.0 | 101. |
| 14 | soil_moisture_1_to_3cm_paris | Float32 | 14246 (39.0%) | 277 (0.8%) | 0.298 | 0.0397 | 0.139 | 0.304 | 0.436 |
| 15 | relative_humidity_2m_paris | Float32 | 0 (0.0%) | 91 (0.2%) | 69.7 | 18.1 | 10.0 | 73.0 | 100. |
| 16 | temperature_2m_lyon | Float32 | 0 (0.0%) | 1565 (4.3%) | 14.1 | 7.97 | -5.89 | 13.8 | 40.3 |
| 17 | precipitation_lyon | Float32 | 0 (0.0%) | 150 (0.4%) | 0.0993 | 0.609 | 0.00 | 0.00 | 26.3 |
| 18 | wind_speed_10m_lyon | Float32 | 0 (0.0%) | 1727 (4.7%) | 8.08 | 6.05 | 0.00 | 6.48 | 43.2 |
| 19 | cloud_cover_lyon | Float32 | 0 (0.0%) | 103 (0.3%) | 64.5 | 41.8 | -1.00 | 92.0 | 101. |
| 20 | soil_moisture_1_to_3cm_lyon | Float32 | 14246 (39.0%) | 290 (0.8%) | 0.296 | 0.0380 | 0.124 | 0.304 | 0.441 |
| 21 | relative_humidity_2m_lyon | Float32 | 0 (0.0%) | 89 (0.2%) | 68.6 | 18.7 | 12.0 | 71.0 | 100. |
| 22 | temperature_2m_marseille | Float32 | 0 (0.0%) | 1276 (3.5%) | 17.5 | 6.15 | 0.317 | 17.1 | 36.6 |
| 23 | precipitation_marseille | Float32 | 0 (0.0%) | 103 (0.3%) | 0.0514 | 0.382 | 0.00 | 0.00 | 21.0 |
| 24 | wind_speed_10m_marseille | Float32 | 0 (0.0%) | 4018 (11.0%) | 14.9 | 10.8 | 0.00 | 11.8 | 74.6 |
| 25 | cloud_cover_marseille | Float32 | 0 (0.0%) | 103 (0.3%) | 46.6 | 44.5 | -1.00 | 31.0 | 101. |
| 26 | soil_moisture_1_to_3cm_marseille | Float32 | 14312 (39.2%) | 354 (1.0%) | 0.227 | 0.0748 | 0.100 | 0.223 | 0.459 |
| 27 | relative_humidity_2m_marseille | Float32 | 0 (0.0%) | 86 (0.2%) | 63.4 | 13.2 | 14.0 | 64.0 | 99.0 |
| 28 | temperature_2m_toulouse | Float32 | 0 (0.0%) | 1513 (4.1%) | 15.2 | 7.48 | -5.33 | 14.6 | 41.2 |
| 29 | precipitation_toulouse | Float32 | 0 (0.0%) | 121 (0.3%) | 0.0740 | 0.587 | 0.00 | 0.00 | 36.9 |
| 30 | wind_speed_10m_toulouse | Float32 | 0 (0.0%) | 2123 (5.8%) | 9.88 | 6.48 | 0.00 | 8.65 | 44.6 |
| 31 | cloud_cover_toulouse | Float32 | 0 (0.0%) | 103 (0.3%) | 62.2 | 42.1 | -1.00 | 87.0 | 101. |
| 32 | soil_moisture_1_to_3cm_toulouse | Float32 | 14246 (39.0%) | 310 (0.8%) | 0.271 | 0.0505 | 0.104 | 0.285 | 0.454 |
| 33 | relative_humidity_2m_toulouse | Float32 | 0 (0.0%) | 93 (0.3%) | 69.7 | 18.7 | 8.00 | 73.0 | 100. |
| 34 | temperature_2m_lille | Float32 | 0 (0.0%) | 2080 (5.7%) | 12.2 | 6.58 | -6.32 | 11.8 | 40.8 |
| 35 | precipitation_lille | Float32 | 0 (0.0%) | 75 (0.2%) | 0.0977 | 0.418 | 0.00 | 0.00 | 14.7 |
| 36 | wind_speed_10m_lille | Float32 | 0 (0.0%) | 2532 (6.9%) | 12.9 | 6.60 | 0.00 | 11.7 | 61.9 |
| 37 | cloud_cover_lille | Float32 | 0 (0.0%) | 103 (0.3%) | 67.5 | 40.4 | -1.00 | 96.0 | 101. |
| 38 | soil_moisture_1_to_3cm_lille | Float32 | 14246 (39.0%) | 209 (0.6%) | 0.306 | 0.0315 | 0.203 | 0.311 | 0.422 |
| 39 | relative_humidity_2m_lille | Float32 | 0 (0.0%) | 96 (0.3%) | 74.7 | 17.1 | 0.00 | 79.0 | 100. |
| 40 | temperature_2m_limoges | Float32 | 0 (0.0%) | 1572 (4.3%) | 12.7 | 7.35 | -7.70 | 12.1 | 39.7 |
| 41 | precipitation_limoges | Float32 | 0 (0.0%) | 153 (0.4%) | 0.123 | 0.623 | 0.00 | 0.00 | 45.5 |
| 42 | wind_speed_10m_limoges | Float32 | 0 (0.0%) | 1359 (3.7%) | 7.58 | 4.77 | 0.00 | 6.52 | 33.9 |
| 43 | cloud_cover_limoges | Float32 | 0 (0.0%) | 103 (0.3%) | 66.6 | 40.8 | -1.00 | 93.0 | 101. |
| 44 | soil_moisture_1_to_3cm_limoges | Float32 | 14246 (39.0%) | 302 (0.8%) | 0.283 | 0.0553 | 0.115 | 0.298 | 0.450 |
| 45 | relative_humidity_2m_limoges | Float32 | 0 (0.0%) | 93 (0.3%) | 75.2 | 19.9 | 8.00 | 81.0 | 100. |
| 46 | temperature_2m_nantes | Float32 | 0 (0.0%) | 1539 (4.2%) | 13.8 | 6.65 | -3.86 | 13.4 | 43.4 |
| 47 | precipitation_nantes | Float32 | 0 (0.0%) | 112 (0.3%) | 0.0870 | 0.437 | 0.00 | 0.00 | 14.1 |
| 48 | wind_speed_10m_nantes | Float32 | 0 (0.0%) | 2833 (7.8%) | 13.4 | 6.91 | 0.00 | 12.0 | 58.6 |
| 49 | cloud_cover_nantes | Float32 | 0 (0.0%) | 103 (0.3%) | 65.1 | 41.3 | -1.00 | 94.0 | 101. |
| 50 | soil_moisture_1_to_3cm_nantes | Float32 | 14312 (39.2%) | 314 (0.9%) | 0.276 | 0.0657 | 0.110 | 0.295 | 0.423 |
| 51 | relative_humidity_2m_nantes | Float32 | 0 (0.0%) | 94 (0.3%) | 74.0 | 17.3 | 7.00 | 78.0 | 100. |
| 52 | temperature_2m_strasbourg | Float32 | 0 (0.0%) | 1525 (4.2%) | 12.7 | 7.74 | -9.31 | 12.3 | 38.8 |
| 53 | precipitation_strasbourg | Float32 | 0 (0.0%) | 127 (0.3%) | 0.102 | 0.510 | 0.00 | 0.00 | 22.1 |
| 54 | wind_speed_10m_strasbourg | Float32 | 0 (0.0%) | 1520 (4.2%) | 8.45 | 5.05 | 0.00 | 7.52 | 38.1 |
| 55 | cloud_cover_strasbourg | Float32 | 0 (0.0%) | 103 (0.3%) | 69.7 | 40.2 | -1.00 | 98.0 | 101. |
| 56 | soil_moisture_1_to_3cm_strasbourg | Float32 | 14246 (39.0%) | 304 (0.8%) | 0.329 | 0.0519 | 0.159 | 0.343 | 0.468 |
| 57 | relative_humidity_2m_strasbourg | Float32 | 0 (0.0%) | 88 (0.2%) | 71.9 | 18.5 | 13.0 | 75.0 | 100. |
| 58 | temperature_2m_brest | Float32 | 0 (0.0%) | 1265 (3.5%) | 13.0 | 4.89 | -2.33 | 12.6 | 40.5 |
| 59 | precipitation_brest | Float32 | 0 (0.0%) | 108 (0.3%) | 0.107 | 0.432 | 0.00 | 0.00 | 12.7 |
| 60 | wind_speed_10m_brest | Float32 | 0 (0.0%) | 3776 (10.3%) | 16.2 | 8.89 | 0.00 | 14.5 | 67.3 |
| 61 | cloud_cover_brest | Float32 | 0 (0.0%) | 102 (0.3%) | 67.9 | 39.8 | 0.00 | 96.0 | 101. |
| 62 | soil_moisture_1_to_3cm_brest | Float32 | 14312 (39.2%) | 279 (0.8%) | 0.267 | 0.0571 | 0.116 | 0.278 | 0.409 |
| 63 | relative_humidity_2m_brest | Float32 | 0 (0.0%) | 90 (0.2%) | 78.2 | 13.9 | 10.0 | 81.0 | 100. |
| 64 | temperature_2m_bayonne | Float32 | 0 (0.0%) | 1554 (4.3%) | 15.0 | 6.40 | -3.32 | 14.9 | 42.4 |
| 65 | precipitation_bayonne | Float32 | 0 (0.0%) | 131 (0.4%) | 0.145 | 0.553 | 0.00 | 0.00 | 18.5 |
| 66 | wind_speed_10m_bayonne | Float32 | 0 (0.0%) | 2488 (6.8%) | 10.9 | 6.72 | 0.00 | 9.36 | 51.5 |
| 67 | cloud_cover_bayonne | Float32 | 0 (0.0%) | 103 (0.3%) | 66.4 | 40.7 | -1.00 | 95.0 | 101. |
| 68 | soil_moisture_1_to_3cm_bayonne | Float32 | 14312 (39.2%) | 299 (0.8%) | 0.276 | 0.0510 | 0.0970 | 0.284 | 0.414 |
| 69 | relative_humidity_2m_bayonne | Float32 | 0 (0.0%) | 91 (0.2%) | 76.2 | 16.1 | 9.00 | 79.0 | 100. |
| 70 | is_holiday_fr | Boolean | 0 (0.0%) | 2 (< 0.1%) | |||||
| 71 | day_of_week_fr | Int8 | 0 (0.0%) | 7 (< 0.1%) | 4.00 | 2.00 | 1.00 | 4.00 | 7.00 |
| 72 | day_of_year_fr | Int16 | 0 (0.0%) | 366 (1.0%) | 181. | 104. | 1.00 | 175. | 366. |
| 73 | hour_of_day_fr | Int8 | 0 (0.0%) | 24 (< 0.1%) | 11.5 | 6.92 | 0.00 | 12.0 | 23.0 |
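The 74 columns of the joined table follow a regular layout: 10 load-derived columns, 6 weather variables for each of the 10 cities, and 4 calendar columns. The sketch below rebuilds the expected column names from that pattern. The lists are copied from the report above, and the variable names are only illustrative.

```python
# Load-derived columns: current load, lags, and rolling statistics.
load_columns = (
    ["load_mw"]
    + [f"load_mw_lag_{lag}" for lag in ("1h", "2h", "3h", "1d", "1w")]
    + [f"load_mw_rolling_median_{window}" for window in ("24h", "7d")]
    + [f"load_mw_iqr_{window}" for window in ("24h", "7d")]
)

cities = [
    "paris", "lyon", "marseille", "toulouse", "lille",
    "limoges", "nantes", "strasbourg", "brest", "bayonne",
]
weather_variables = [
    "temperature_2m", "precipitation", "wind_speed_10m",
    "cloud_cover", "soil_moisture_1_to_3cm", "relative_humidity_2m",
]
# All variables of one city come before the next city.
weather_columns = [
    f"{variable}_{city}" for city in cities for variable in weather_variables
]

calendar_columns = [
    "is_holiday_fr", "day_of_week_fr", "day_of_year_fr", "hour_of_day_fr",
]

expected_columns = load_columns + weather_columns + calendar_columns
print(len(expected_columns))  # 74
```

Comparing such a list with the column names of the evaluated table is a cheap way to detect a join that silently dropped or duplicated a column.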
horizons = range(1, 3)  # Horizons of 1 and 2 hours here; use range(1, 25) for 1 to 24 hours
target_column_name_pattern = "load_mw_horizon_{horizon}h"
targets = prediction_time.join(
    electricity.with_columns(
        [
            # A negative shift aligns each row with the load observed h hours later.
            pl.col("load_mw")
            .shift(-h)
            .alias(target_column_name_pattern.format(horizon=h))
            for h in horizons
        ]
    ),
    left_on="prediction_time",
    right_on="time",
)
horizon_of_interest = 2
target_column_name = target_column_name_pattern.format(horizon=horizon_of_interest)
predicted_target_column_name = "predicted_" + target_column_name
target = targets[target_column_name].skb.mark_as_y()
from sklearn.ensemble import HistGradientBoostingRegressor
predictions = features.skb.apply(
    HistGradientBoostingRegressor(
        random_state=0,
        learning_rate=skrub.choose_float(
            0.01, 0.9, default=0.1, log=True, name="learning_rate"
        ),
        max_leaf_nodes=skrub.choose_int(
            3, 300, default=30, log=True, name="max_leaf_nodes"
        ),
    ),
    y=target,
)
predictions
| load_mw_horizon_2h |
|---|
| 43785.582565286044 |
| 47017.314094100424 |
| 50859.031142652675 |
| 54286.464153657565 |
| 57482.8852727483 |
| 42679.87717441464 |
| 40822.15317467512 |
| 38100.24131858226 |
| 37004.48080729353 |
| 33820.92691967832 |
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | load_mw_horizon_2h | Float64 | 0 (0.0%) | 36550 (100.0%) | 4.98e+04 | 1.04e+04 | 3.01e+04 | 4.81e+04 | 8.45e+04 |
altair.Chart(
    pl.concat(
        [
            targets.skb.eval(),
            predictions.rename(
                {target_column_name: predicted_target_column_name}
            ).skb.eval(),
        ],
        how="horizontal",
    ).tail(24 * 7)
).transform_fold(
    [target_column_name, predicted_target_column_name],
).mark_line(
    tooltip=True
).encode(
    x="prediction_time:T", y="value:Q", color="key:N"
).interactive()
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import make_scorer, mean_absolute_percentage_error, get_scorer
mape_scorer = make_scorer(mean_absolute_percentage_error)
ts_cv_5 = TimeSeriesSplit(n_splits=5, max_train_size=10_000, gap=24)
predictions.skb.cross_validate(
    cv=ts_cv_5,
    scoring={
        "r2": get_scorer("r2"),
        "mape": mape_scorer,
    },
    return_train_score=True,
    verbose=1,
    n_jobs=-1,
).round(3)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done 5 out of 5 | elapsed: 5.4s finished
| | fit_time | score_time | test_r2 | train_r2 | test_mape | train_mape |
|---|---|---|---|---|---|---|
| 0 | 1.086 | 0.066 | 0.970 | 0.997 | 0.021 | 0.008 |
| 1 | 1.364 | 0.067 | 0.983 | 0.996 | 0.020 | 0.010 |
| 2 | 1.432 | 0.053 | 0.972 | 0.995 | 0.025 | 0.010 |
| 3 | 1.399 | 0.066 | 0.973 | 0.994 | 0.023 | 0.012 |
| 4 | 1.058 | 0.038 | 0.977 | 0.993 | 0.023 | 0.014 |
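To see what `max_train_size` and `gap` do in the `TimeSeriesSplit` used above, the following sketch prints the index ranges of each fold on a small toy array. The sizes (100 samples, 3 splits, a training window of 30 and a gap of 5) are chosen only to keep the output short and are not the values used in this notebook.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# 100 consecutive time steps stand in for the hourly prediction times.
X_toy = np.arange(100).reshape(-1, 1)

toy_cv = TimeSeriesSplit(n_splits=3, max_train_size=30, gap=5)
splits = list(toy_cv.split(X_toy))
for train_idx, test_idx in splits:
    # Each training window ends `gap` samples before its test window starts.
    print(
        f"train=[{train_idx[0]}, {train_idx[-1]}] "
        f"test=[{test_idx[0]}, {test_idx[-1]}]"
    )
```

Every training window ends before its test window begins, with `gap` samples left out in between, and it never holds more than `max_train_size` samples. With hourly data, `gap=24` therefore leaves a full day between the end of the training data and the start of the test data.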
ts_cv_3 = TimeSeriesSplit(n_splits=3, max_train_size=10_000, gap=24)
randomized_search = predictions.skb.get_randomized_search(
    cv=ts_cv_3,
    scoring="r2",
    n_iter=30,
    fitted=True,
    verbose=1,
    n_jobs=-1,
)
randomized_search.results_
Fitting 3 folds for each of 30 candidates, totalling 90 fits
| | learning_rate | max_leaf_nodes | mean_test_score |
|---|---|---|---|
| 0 | 0.101497 | 126 | 0.980176 |
| 1 | 0.079763 | 40 | 0.980106 |
| 2 | 0.074376 | 185 | 0.979518 |
| 3 | 0.056106 | 231 | 0.978974 |
| 4 | 0.158866 | 187 | 0.978394 |
| 5 | 0.177116 | 12 | 0.977895 |
| 6 | 0.181058 | 8 | 0.977489 |
| 7 | 0.288475 | 18 | 0.976352 |
| 8 | 0.343255 | 20 | 0.976326 |
| 9 | 0.102882 | 8 | 0.976052 |
| 10 | 0.339992 | 39 | 0.975655 |
| 11 | 0.397618 | 7 | 0.975605 |
| 12 | 0.308953 | 224 | 0.975359 |
| 13 | 0.186020 | 5 | 0.974518 |
| 14 | 0.448934 | 6 | 0.973496 |
| 15 | 0.370166 | 4 | 0.973071 |
| 16 | 0.617871 | 4 | 0.972331 |
| 17 | 0.480384 | 17 | 0.971334 |
| 18 | 0.508125 | 14 | 0.971045 |
| 19 | 0.542180 | 207 | 0.969476 |
| 20 | 0.588342 | 84 | 0.968365 |
| 21 | 0.204359 | 3 | 0.966061 |
| 22 | 0.684306 | 24 | 0.962164 |
| 23 | 0.889901 | 11 | 0.960339 |
| 24 | 0.045399 | 5 | 0.950639 |
| 25 | 0.879902 | 87 | 0.949598 |
| 26 | 0.051006 | 3 | 0.934190 |
| 27 | 0.022310 | 8 | 0.918392 |
| 28 | 0.014183 | 186 | 0.910541 |
| 29 | 0.013296 | 228 | 0.898933 |
randomized_search.plot_results()
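The search space above samples `learning_rate` between 0.01 and 0.9 with `log=True`. As a rough illustration of what sampling on a log scale means, the sketch below draws from `scipy.stats.loguniform` over the same range. That `skrub.choose_float(..., log=True)` behaves like this distribution is an assumption made for the example, not something shown in this notebook.

```python
import numpy as np
from scipy.stats import loguniform

low, high = 0.01, 0.9
rng = np.random.default_rng(0)
# Assumption: log-scale sampling is modeled here by a log-uniform distribution.
samples = loguniform(low, high).rvs(size=10_000, random_state=rng)

geometric_mean = np.sqrt(low * high)  # midpoint of the range on a log scale
fraction_below = np.mean(samples < geometric_mean)
print(f"log-scale midpoint: {geometric_mean:.4f}")
print(f"fraction of draws below it: {fraction_below:.2f}")
```

About half of the draws fall below the log-scale midpoint (close to 0.095), so small learning rates are explored as thoroughly as large ones. A plain uniform distribution over the same range would spend most of its draws on values above 0.1.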
# nested_cv_results = skrub.cross_validate(
#     environment=predictions.skb.get_data(),
#     pipeline=randomized_search,
#     cv=ts_cv_5,
#     scoring={
#         "r2": get_scorer("r2"),
#         "mape": mape_scorer,
#     },
#     n_jobs=-1,
#     return_pipeline=True,
# ).round(3)
# nested_cv_results
# for outer_cv_idx in range(len(nested_cv_results)):
#     print(
#         nested_cv_results.loc[outer_cv_idx, "pipeline"]
#         .results_.loc[0]
#         .round(3)
#         .to_dict()
#     )
# from joblib import Parallel, delayed
# cv_predictions = []
# for ts_cv_train_idx, ts_cv_test_idx in ts_cv_5.split(prediction_time.skb.eval()):
#     features[ts_cv_train_idx].fit
predictions = features.skb.apply(
    skrub.SelectCols(
        cols=skrub.choose_from(
            [
                skrub.selectors.all(),
                skrub.selectors.filter_names(
                    lambda name: name.startswith("load_mw_"),
                ),
                skrub.selectors.filter_names(
                    lambda name: name.startswith("load_mw_")
                    or name.startswith("temperature_"),
                ),
                skrub.selectors.filter_names(
                    lambda name: name.startswith("load_mw_")
                    or name.startswith("precipitation_"),
                ),
                skrub.selectors.filter_names(
                    lambda name: name.startswith("load_mw_")
                    or name.startswith("wind_speed_"),
                ),
                skrub.selectors.filter_names(
                    lambda name: name.startswith("load_mw_")
                    or name.startswith("cloud_cover_"),
                ),
                # calendar features
                skrub.selectors.filter_names(
                    lambda name: name.startswith("load_mw_") or name.endswith("_fr"),
                ),
            ],
            name="feature_subset",
        )
    )
).skb.apply(
    HistGradientBoostingRegressor(
        random_state=0,
        learning_rate=skrub.choose_float(
            0.01, 0.9, default=0.1, log=True, name="learning_rate"
        ),
    ),
    y=target,
)
randomized_search = predictions.skb.get_randomized_search(
    cv=ts_cv_3,
    scoring="r2",
    n_iter=30,
    fitted=True,
    verbose=1,
    n_jobs=-1,
)
randomized_search.results_
Fitting 3 folds for each of 30 candidates, totalling 90 fits
| | feature_subset | learning_rate | mean_test_score |
|---|---|---|---|
| 0 | all() | 0.099536 | 0.980097 |
| 1 | all() | 0.067589 | 0.979463 |
| 2 | filter_names(<lambda>) | 0.078805 | 0.969532 |
| 3 | filter_names(<lambda>) | 0.218262 | 0.968531 |
| 4 | filter_names(<lambda>) | 0.399262 | 0.962610 |
| 5 | all() | 0.026653 | 0.962475 |
| 6 | filter_names(<lambda>) | 0.036151 | 0.961864 |
| 7 | filter_names(<lambda>) | 0.667629 | 0.952211 |
| 8 | filter_names(<lambda>) | 0.019793 | 0.926310 |
| 9 | filter_names(<lambda>) | 0.176460 | 0.902561 |
| 10 | filter_names(<lambda>) | 0.171368 | 0.902167 |
| 11 | filter_names(<lambda>) | 0.088878 | 0.900799 |
| 12 | filter_names(<lambda>) | 0.052764 | 0.900393 |
| 13 | filter_names(<lambda>) | 0.115793 | 0.899440 |
| 14 | filter_names(<lambda>) | 0.048053 | 0.898641 |
| 15 | filter_names(<lambda>) | 0.040954 | 0.898554 |
| 16 | filter_names(<lambda>) | 0.117140 | 0.898435 |
| 17 | filter_names(<lambda>) | 0.037720 | 0.897170 |
| 18 | filter_names(<lambda>) | 0.037271 | 0.895901 |
| 19 | filter_names(<lambda>) | 0.317319 | 0.894193 |
| 20 | filter_names(<lambda>) | 0.030575 | 0.893030 |
| 21 | filter_names(<lambda>) | 0.338578 | 0.891036 |
| 22 | filter_names(<lambda>) | 0.025810 | 0.889244 |
| 23 | filter_names(<lambda>) | 0.370924 | 0.889087 |
| 24 | filter_names(<lambda>) | 0.025857 | 0.886896 |
| 25 | filter_names(<lambda>) | 0.348733 | 0.884695 |
| 26 | filter_names(<lambda>) | 0.500632 | 0.880784 |
| 27 | filter_names(<lambda>) | 0.464976 | 0.874003 |
| 28 | filter_names(<lambda>) | 0.013596 | 0.866150 |
| 29 | filter_names(<lambda>) | 0.805895 | 0.833898 |
# nested_cv_results = skrub.cross_validate(
#     environment=predictions.skb.get_data(),
#     pipeline=randomized_search,
#     cv=ts_cv_5,
#     scoring={
#         "r2": get_scorer("r2"),
#         "mape": mape_scorer,
#     },
#     n_jobs=-1,
#     return_pipeline=True,
# ).round(3)
# nested_cv_results
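# Note: only `load_mw` is dropped here, so `prediction_time` stays in `targets`
# and is fitted as an extra first output next to the horizon columns. This
# appears to be why the score table below reports three outputs for two
# horizons, with a very poor test R2 on the first one.
targets = targets.skb.drop(cols=["load_mw"]).skb.mark_as_y()
from sklearn.multioutput import MultiOutputRegressor
model = MultiOutputRegressor(
    estimator=HistGradientBoostingRegressor(
        random_state=0,
        learning_rate=skrub.choose_float(
            0.01, 0.9, default=0.1, log=True, name="learning_rate"
        ),
    ),
)
predictions = features.skb.apply(model, y=targets)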
from sklearn.metrics import r2_score
def multioutput_scorer(regressor, X, y, score_func, score_name):
    # Compute one score per output column. The labels are positional: the
    # i-th column of y is reported as horizon i, starting at 1.
    y_pred = regressor.predict(X)
    return {
        f"{score_name}_horizon_{h}h": score
        for h, score in enumerate(
            score_func(y, y_pred, multioutput="raw_values"), start=1
        )
    }
def scoring(regressor, X, y):
    # Merge the per-output MAPE and R2 scores into a single dict.
    return {
        **multioutput_scorer(regressor, X, y, mean_absolute_percentage_error, "mape"),
        **multioutput_scorer(regressor, X, y, r2_score, "r2"),
    }
predictions.skb.cross_validate(
    cv=ts_cv_5,
    scoring=scoring,
    return_train_score=True,
    verbose=1,
    n_jobs=-1,
).round(3)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done 5 out of 5 | elapsed: 7.6s finished
| | fit_time | score_time | test_mape_horizon_1h | train_mape_horizon_1h | test_mape_horizon_2h | train_mape_horizon_2h | test_mape_horizon_3h | train_mape_horizon_3h | test_r2_horizon_1h | train_r2_horizon_1h | test_r2_horizon_2h | train_r2_horizon_2h | test_r2_horizon_3h | train_r2_horizon_3h |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3.312 | 0.318 | 0.017 | 0.0 | 0.013 | 0.006 | 0.020 | 0.008 | -19.596 | 1.000 | 0.988 | 0.998 | 0.974 | 0.997 |
| 1 | 3.869 | 0.322 | 0.019 | 0.0 | 0.012 | 0.007 | 0.020 | 0.010 | -24.166 | 0.998 | 0.993 | 0.998 | 0.983 | 0.996 |
| 2 | 4.326 | 0.308 | 0.010 | 0.0 | 0.016 | 0.007 | 0.025 | 0.010 | -8.691 | 1.000 | 0.987 | 0.998 | 0.973 | 0.995 |
| 3 | 3.975 | 0.332 | 0.018 | 0.0 | 0.015 | 0.009 | 0.023 | 0.012 | -23.756 | 0.999 | 0.988 | 0.997 | 0.973 | 0.995 |
| 4 | 2.868 | 0.185 | 0.018 | 0.0 | 0.014 | 0.009 | 0.023 | 0.013 | -23.643 | 0.999 | 0.991 | 0.997 | 0.977 | 0.993 |
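The scoring function above relies on `multioutput="raw_values"` to get one score per output column. The sketch below illustrates this on toy arrays (all values are made up) and shows that the resulting labels depend only on column position, so the column order of `y` decides which score ends up under which name.

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error, r2_score

# Two toy output columns; the second one is predicted perfectly.
y_true = np.array([[100.0, 200.0], [110.0, 210.0], [120.0, 220.0]])
y_pred = np.array([[110.0, 200.0], [121.0, 210.0], [132.0, 220.0]])

# With multioutput="raw_values", each metric returns one value per column.
per_output_mape = mean_absolute_percentage_error(
    y_true, y_pred, multioutput="raw_values"
)
per_output_r2 = r2_score(y_true, y_pred, multioutput="raw_values")

scores = {
    f"mape_column_{i}": float(score) for i, score in enumerate(per_output_mape)
}
print(scores)
print(per_output_r2)
```

The first column is off by 10% on every row, so its MAPE is 0.1 and its R2 is negative, while the second column scores perfectly. If an unrelated column came first in `y`, its scores would take the first label and all the others would shift by one.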